@article{3070800, title = "Evaluating Geospatial RDF Stores Using the Benchmark Geographica 2", author = "Ioannidis, T. and Garbis, G. and Kyzirakos, K. and Bereta, K. and Koubarakis, M.", journal = "Journal on Data Semantics", year = "2021", volume = "10", number = "3-4", pages = "189--228", publisher = "Springer Science and Business Media Deutschland GmbH", issn = "1861-2032, 1861-2040", doi = "10.1007/s13740-021-00118-x", keywords = "Digital storage; Geometry; Relational database systems; Scalability; Semantics; Statistical tests; Complex geometries; Geospatial applications; Geospatial features; Logical consequences; Query response time; Real-world datasets; Synthetic workloads; Systematic evaluation; Semantic Web", abstract = "Geospatial extensions of SPARQL, like GeoSPARQL and stSPARQL, have been defined since 2007, and while several geospatial RDF stores have implemented a substantial part of these extensions, other stores limited their support mostly on point geometry features. A parallel process with the above was that RDF frameworks evolved in an interesting way by presenting a more mature set of geospatial features, such as GeoSPARQL support and including the latest indexing technologies. As a logical consequence, a shift in the use of RDF frameworks is to be expected, from base platforms that users extend to create more complete geospatial RDF stores, to attractive finished RDF solutions for many geospatial applications. Alongside with the ever-increasing size of linked geospatial data that semantic stores need to handle, all the above provided our group the motivation to improve our single-node systems benchmark Geographica, originally defined in 2013. Geographica 2 is more comprehensive, because it now includes new geospatial RDF stores and frameworks, big real-world datasets of many hundred million triples with up to 50 million features of complex geometries, new tests and queries that reveal the scalability of these systems. 
The augmented and revised real-world workload of Geographica 2 tests the efficiency of primitive spatial functions in RDF stores, their performance in the geocoding scenario against the new Census dataset in addition to many other real use case scenarios and finally includes computation of statistics for geospatial datasets. A more detailed and systematic evaluation is performed using the synthetic workload. The new scalability workload aims at discovering the limits of centralized geospatial RDF stores of various architectures. It employs a set of six well-balanced real-world datasets with highly complex geometries covering many European countries and compares three RDF stores in terms of storage space, bulk loading and query response time. In addition, a special version of the benchmark has been created for systems with limited geospatial functionality and two more systems of this category are introduced along the six systems of the main benchmark, all stressed against point-only subsets of the workloads. Three out of the eight systems use an RDBMS for the persistence layer, while some of them offer a variety of persistence options. © 2021, The Author(s), under exclusive licence to Springer-Verlag GmbH Germany, part of Springer Nature." }