10 March 2025
Analyzing GTFS Realtime Data for Public Transport Insights
In today’s post, we (that is, Gaspard Merten from Université Libre de Bruxelles and yours truly) are going to dive deep into how to analyze public transport data, using both schedule and real-time information. This collaboration has been made possible by the EMERALDS project. Previously, I shared news about GTFS algorithms for Trajectools that add GTFS preprocessing tools (incl. route, segment, and stop layer extraction) to the QGIS Processing toolbox. Today, we’ll discuss how to handle realtime GTFS data and how we approach analytics that combine both data sources.
About Realtime GTFS
Many of us have come to rely on real-time public transport updates in apps like Google Maps. These apps are powered by standardized data formats that ensure different systems can communicate. Google first introduced GTFS in 2005, a format designed to organize transit schedules, stop locations, and other static transit information. Then, in 2011, they introduced GTFS Realtime (GTFS-RT), which added the capability to include live updates on vehicle positions, delays, speeds, and much more. However, as the name suggests, GTFS Realtime is all about live data. This means that while GTFS …
1 March 2025
Trajectools is moving to Codeberg
The Trajectools repository is migrating from GitHub to Codeberg. The new home for Trajectools is: https://codeberg.org/movingpandas/trajectools The GitHub repo remains as a writable mirror, for now, but the issue tracking is only active on Codeberg.
Why the move?
I am working on moving my projects to European infrastructure that better aligns with my values. Codeberg is a nonprofit and libre-friendly platform based in Germany. This will ensure that the projects are hosted on infrastructure that prioritizes user privacy and open-source ideals.
What does this mean for users?
No impact on functionality – Trajectools remains the same great tool for trajectory analysis, available through the recently updated QGIS Plugin Repo.
Development continues – I’ll continue actively maintaining and improving the project. (If you want to file feature requests, please note that the issue tracker on the GitHub mirror has been deactivated and issues should be filed on Codeberg instead.)
What does this mean for contributors?
If you’re contributing to Trajectools, simply update your remotes to the new repository. The GitHub repo continues to accept PRs and the changes are synced between GitHub and Codeb …
31 January 2025
Geocomputation with Python: now in print!
Today, I’m super excited to share with you the announcement that our open source textbook “Geocomputation with Python” has finally arrived in print and is now available for purchase from Routledge.com, Amazon.com, Amazon.co.uk, and other booksellers. “Geocomputation with Python” (or geocompy for short) covers the entire range of standard GIS operations for both vector and raster data models. Each section and chapter builds on the previous. If you’re just starting out with Python to work with geographic data, we hope that the book will be an excellent place to start. Of course, you can still find the online version of the book at py.geocompx.org. The book is open-source and you can find the code on GitHub. This ensures that the content is reproducible, transparent, and accessible. It also lets you interact with the project by opening issues and submitting pull requests. …
11 January 2025
Trajectools 2.4 release
In this new release, you will find new algorithms, default output styles, and other usability improvements, in particular for working with public transport schedules in GTFS format, including:
- Added GTFS algorithms for extracting stops, fixes #43
- Added default output styles for GTFS stops and segments c600060
- Added Trajectory splitting at field value changes 286fdbd
- Added option to add selected fields to output trajectories layer, fixes #53
- Improved UI of the split by observation gap algorithm, fixes #36
Note: To use this new version of Trajectools, please upgrade your installation of MovingPandas to >= 0.21.2, e.g. using import pip; pip.main(['install', '--upgrade', 'movingpandas']) or conda install movingpandas==0.21.2 …
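Before upgrading, it can help to check which MovingPandas version is currently installed. The following is a minimal sketch using only the Python standard library; the helper names `release_tuple` and `meets_minimum` are hypothetical, and the version parsing is deliberately naive (it only looks at the leading digits of the first three dot-separated parts).

```python
from importlib.metadata import PackageNotFoundError, version
from itertools import takewhile


def release_tuple(text):
    """Turn a version string such as '0.21.2' into a tuple of ints (hypothetical helper)."""
    numbers = []
    for piece in text.split(".")[:3]:
        # Keep only the leading digits, so '0rc1' counts as 0.
        digits = "".join(takewhile(str.isdigit, piece))
        if not digits:
            break
        numbers.append(int(digits))
    return tuple(numbers)


def meets_minimum(installed, minimum="0.21.2"):
    """Return True if the installed version is at least the minimum (hypothetical helper)."""
    return release_tuple(installed) >= release_tuple(minimum)


try:
    installed = version("movingpandas")
    if meets_minimum(installed):
        print(f"MovingPandas {installed} is recent enough")
    else:
        print(f"MovingPandas {installed} is too old, please upgrade to >= 0.21.2")
except PackageNotFoundError:
    print("MovingPandas is not installed in this Python environment")
```

In QGIS, this would have to run in the QGIS Python console so that it inspects the same environment the plugin uses.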
17 December 2024
Urban mobility insights with MovingPandas & CARTO in Snowflake
Today, I want to point out a blog post over at https://carto.com/blog/urban-mobility-insights-with-movingpandas-carto-in-snowflake written together with my fellow co-authors and EMERALDS project team member Argyrios Kyrgiazos. For the technically inclined, the highlights are the presented UDFs in Snowflake to process and transform the trajectory data. For example, here’s a TemporalSplitter UDF:
CREATE OR REPLACE FUNCTION CARTO_DATABASE.CARTO.TemporalSplitter(geom ARRAY, t ARRAY, mode STRING)
RETURNS ARRAY
LANGUAGE PYTHON
RUNTIME_VERSION = 3.11
PACKAGES = ('numpy','pandas', 'geopandas','movingpandas', 'shapely')
HANDLER = 'udf'
AS $$
import numpy as np
import pandas as pd
import geopandas as gpd
import movingpandas as mpd
import shapely
from shapely.geometry import shape, mapping, Point, Polygon
from shapely.validation import make_valid
from datetime import datetime, timedelta
def udf(geom, t, mode):
    valid_df = pd.DataFrame(geom, columns=['geometry'])
    valid_df['t'] = pd.to_datetime(t)
    valid_df['geometry'] = valid_df['geometry'].apply(lambda x:shapely.wkt.loads(x))
    gdf = gpd.GeoDataFrame(valid_df, geometry='geometry', crs='epsg:4326')
    gdf = gdf.set_index('t')
    traj = mpd.Trajectory(gdf …
23 November 2024
GeoParquet in QGIS – smaller & faster files for the win!
tldr; Tired of working with large CSV files? Give GeoParquet a try! “Parquet is a powerful column-oriented data format, built from the ground up to as a modern alternative to CSV files.” https://geoparquet.org/ (Geo)Parquet is both smaller and faster than CSV. Additionally, (Geo)Parquet columns are typed. Text, numeric values, dates, geometries retain their data types. GeoParquet also stores CRS information and support in GIS solutions is growing. I’ll be giving a quick overview using AIS data in GeoPandas 1.0.1 (with pyarrow) and QGIS 3.38 (with GDAL 3.9.2).
File size
The example AIS dataset for this demo contains ~10 million rows with 22 columns. I’ve converted the original zipped CSV into GeoPackage and GeoParquet using GeoPandas to illustrate the huge difference in file size: ~470 MB for GeoParquet and zipped CSV, 1.6 GB for CSV, and a whopping 2.6 GB for GeoPackage:
Reading performance
Pandas and GeoPandas both support selective reading of files, i.e. we can specify the specific columns to be loaded. This does speed up reading, even from CSV files:
| Format | Whole file | Selected columns |
|---|---|---|
| CSV | 27.9 s | 13.1 s |
| GeoPackage | 2min 12s | 20.2 s |
| GeoParquet | 7.2 s | 4.1 s |
Indeed, reading the whole GeoPackage is get …
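To illustrate what selective reading looks like in code, here is a minimal sketch using pandas on a tiny in-memory CSV. The column names and values are made up for illustration and are not the columns of the AIS dataset used in the post.

```python
import io

import pandas as pd

# Hypothetical miniature stand-in for an AIS export; the real file has 22 columns.
csv_text = (
    "mmsi,timestamp,lon,lat,sog,ship_type\n"
    "219000001,2024-01-01 00:00:00,11.90,57.70,10.2,Cargo\n"
    "219000002,2024-01-01 00:00:10,11.95,57.72,0.0,Fishing\n"
)

# Read everything ...
full = pd.read_csv(io.StringIO(csv_text))

# ... or only the columns needed for the analysis at hand.
selected = pd.read_csv(io.StringIO(csv_text), usecols=["timestamp", "lon", "lat"])

print(full.shape)
print(list(selected.columns))
```

For (Geo)Parquet files, `pandas.read_parquet` and `geopandas.read_parquet` take a `columns` argument for the same purpose. Because Parquet is column-oriented, unselected columns do not have to be read from disk at all, whereas a CSV reader still has to scan every row in full.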
4 November 2024
GeoAI: key developments & insights
It’s been a while since my post on geo and the AI hype in 2019. Back then, I didn’t use the term “GeoAI”, even though it has certainly been around for a while (including, e.g., with dedicated SIGSPATIAL workshops since 2017). GeoAI isn’t one single thing. It’s an umbrella term, including: “AI for Geo” (using AI methods in Geo, e.g. deep learning for object recognition in remote sensing images) and “Geo for AI” (integrating geographic concepts into AI models, e.g. by building spatially explicit models). [Zhang 2020] [Li et al. 2024] Today’s post is a collection of key GeoAI developments I’m aware of. If I missed anything you are excited about, please let me know here in the comments or over on Mastodon.
Background
A week ago, I had the pleasure to attend a “Specialist Meeting” on GeoAI here in Vienna, meeting over 40 researchers from around the world, from master’s students to professor emeritus. Huge props to Jano (Prof. Krzysztof Janowicz) and his team at Uni Wien for bringing this awesome group of people together.
The elephant in the room: LLMs
Unsurprisingly, LLMs and the claims they make about geography are a major issue due to the mistakes they make and the biases behind them. A …
6 October 2024
LLM-based spatial analysis assistants for QGIS
After the initial ChatGPT hype in 2023 (when we saw the first LLM-backed QGIS plugins, e.g. QChatGPT and QGPT Agent), there has been a notable slump in new development. As far as I can tell, none of the early plugins are actively maintained anymore. They were nice tech demos but with limited utility. However, in the last month, I saw two new approaches for combining LLMs with QGIS that I want to share in this post:
IntelliGeo plugin: generating PyQGIS scripts or graphical models
At the QGIS User Conference in Bratislava, I had the pleasure to attend the “Large Language Models and GIS” workshop presented by Gustavo Garcia and Zehao Lu from the University of Twente. There, they presented the IntelliGeo Plugin which enables the automatic generation of PyQGIS scripts and graphical models. The workshop was packed. After we installed all dependencies and the plugin, it was exciting to test the graphical model generation capabilities. During the workshop, we used OpenAI’s API but the readme also mentions support for Cohere. I was surprised to learn that even simple graphical models are actually pretty large files. This makes it very challenging to generate and/or modify models because …
21 September 2024
Trajectools tutorial: trajectory preprocessing
Today marks the release of Trajectools 2.3 which brings a new set of algorithms, including trajectory generalizing, cleaning, and smoothing. To give you a quick impression of what some of these algorithms would be useful for, this post introduces a trajectory preprocessing workflow that is quite general-purpose and can be adapted to many different datasets. We start out with the Geolife sample dataset which you can find in the Trajectools plugin directory’s sample_data subdirectory. This small dataset includes 5908 points forming 5 trajectories, based on the trajectory_id field: We first split our trajectories by observation gaps to ensure that there are no large gaps in our trajectories. Let’s make a cut at 15 minutes: This splits the original 5 trajectories into 11 trajectories: When we zoom, for example, to the two trajectories in the northwestern corner, we can see that the trajectories are pretty noisy and there’s even a spike / outlier at the western end: If we label the points with the corresponding speeds, we can see how unrealistic they are: over 300 km/h! Let’s remove outliers over 50 km/h: Better but not perfect: Let’s smooth the trajectories to get rid of more of the …
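The first two steps of this workflow (splitting at observation gaps, then dropping points that imply unrealistic speeds) can be sketched in plain Python. This is a simplified illustration of the idea under stated assumptions, not the implementation used by Trajectools or MovingPandas: the `Fix` class, the function names, and the sample coordinates are all hypothetical, and the speed filter simply compares each point with the last point that was kept.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from math import asin, cos, radians, sin, sqrt


@dataclass
class Fix:
    """One timestamped position (hypothetical helper type)."""
    t: datetime
    lon: float
    lat: float


def haversine_m(a, b):
    """Great-circle distance between two fixes in metres."""
    dlat = radians(b.lat - a.lat)
    dlon = radians(b.lon - a.lon)
    h = sin(dlat / 2) ** 2 + cos(radians(a.lat)) * cos(radians(b.lat)) * sin(dlon / 2) ** 2
    return 2 * 6371000.0 * asin(sqrt(h))


def split_by_gap(fixes, max_gap):
    """Start a new trajectory whenever two consecutive fixes are more than max_gap apart."""
    parts, current = [], []
    for fix in fixes:
        if current and fix.t - current[-1].t > max_gap:
            parts.append(current)
            current = []
        current.append(fix)
    if current:
        parts.append(current)
    return parts


def drop_speed_outliers(fixes, max_kmh):
    """Drop fixes that would require exceeding max_kmh from the last kept fix."""
    kept = fixes[:1]
    for fix in fixes[1:]:
        seconds = (fix.t - kept[-1].t).total_seconds()
        if seconds <= 0:
            continue  # skip duplicate or out-of-order timestamps
        if haversine_m(kept[-1], fix) / seconds * 3.6 <= max_kmh:
            kept.append(fix)
    return kept


t0 = datetime(2009, 1, 1, 8, 0)
fixes = [
    Fix(t0, 116.3000, 39.9000),
    Fix(t0 + timedelta(minutes=1), 116.3000, 39.9005),
    Fix(t0 + timedelta(minutes=2), 116.4000, 39.9010),   # spike: several km in one minute
    Fix(t0 + timedelta(minutes=3), 116.3000, 39.9015),
    Fix(t0 + timedelta(minutes=33), 116.3000, 39.9020),  # 30 minute observation gap
    Fix(t0 + timedelta(minutes=34), 116.3000, 39.9025),
]

parts = split_by_gap(fixes, max_gap=timedelta(minutes=15))
cleaned = [drop_speed_outliers(part, max_kmh=50) for part in parts]
print(len(parts), [len(part) for part in cleaned])
```

Comparing against the last kept point (rather than the previous raw point) means that a single spike is removed without also discarding the valid point that follows it.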
29 August 2024
Building spatial analysis assistants using OpenAI’s Assistant API
Earlier this year, I shared my experience using ChatGPT’s Data Analyst web interface for analyzing spatiotemporal data in the post “ChatGPT Data Analyst vs. Movement Data”. The Data Analyst web interface, while user-friendly, is not equipped to handle all types of spatial data tasks, particularly those involving more complex or large-scale datasets. Additionally, because the code is executed on a remote server, we’re limited to the libraries and tools available in that environment. I’ve often encountered situations where the Data Analyst simply doesn’t have access to the necessary libraries in its Python environment, which can be frustrating if you need specific GIS functionality. Today, we’ll therefore start to explore alternatives to ChatGPT’s Data Analyst Web Interface, specifically, the OpenAI Assistant API. Later, I plan to dive deeper into even more flexible approaches, like Langchain’s Pandas DataFrame Agents. We’ll explore these options using a spatial analysis workflow with steps such as:
- Loading a zipped shapefile and investigating its content
- Finding the three largest cities in the dataset
- Selecting all cities in a region, e.g. in Scandinavia from the dataset
- Creating static and inter …