Thursday, 17 January 2013

PublicTransport 0.11 now in beta

Since alpha 2 there have been many crash and bug fixes and some visual polishing. There were some random crashes with script providers and some rare ones, such as when trying to expand a departure item that is currently being removed (animated).

Stop IDs are now used by the applet instead of stop names, if available. The data engine distinguishes between names and IDs, so scripts know whether or not an ID was given and can use different URLs accordingly. For providers with slow web servers the timeouts were fixed. A global timeout of 60 seconds is now used for script execution and asynchronous network requests (synchronous requests are counted as script execution time). Previously there was a global timeout of about 10 seconds, which made the default 30-second timeout for network requests pointless.

Getting additional data for multiple timetable items at once now works much faster, because all requests get started together, not one after the other.
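To illustrate why starting all requests together helps, here is a minimal Python sketch. It is not the engine's code (which is C++/QtScript); `fetch_additional_data` is a hypothetical stand-in for one additional-data request, and its returned fields are made up.

```python
from concurrent.futures import ThreadPoolExecutor


def fetch_additional_data(item_index):
    """Hypothetical stand-in for one additional-data request (e.g. route data)."""
    return {"item": item_index, "routeStops": ["Stop A", "Stop B"]}


def fetch_all_at_once(item_indices):
    """Start every request before waiting for any result."""
    workers = max(1, len(item_indices))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # All requests are submitted here, so they can run concurrently ...
        futures = [pool.submit(fetch_additional_data, i) for i in item_indices]
        # ... and only afterwards are the results collected, in item order.
        return [future.result() for future in futures]


results = fetch_all_at_once([0, 1, 2])
```

With real network requests the total waiting time is then roughly that of the slowest request instead of the sum of all requests.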

Besides fixing bugs I did some visual polishing of the applet. All route items now use the same angle for the stop names. The halo drawn behind route stop names is no longer clipped and properly fades out. Removal of city names from route stop names is now also done automatically if the route data was received as additional data. There's now also some more space for the route item.
See the difference:
Applet in 0.11 alpha 2

Applet in 0.11 beta

The applet now also shows a link to an HTML version of the route document for HAFAS providers. Links found in JourneyNews strings can now be clicked in the applet to open them in a browser. If there was an error while trying to get the additional data the applet will show it:
Errors while getting additional data are now shown in the applet

A busy widget gets shown while route data gets requested as additional data.

Another new thing is using snap scrolling from Plasma::ScrollWidget to always stop (kinetic) scrolling at the top of departure items. When a departure item is expanded, the applet tries to make it completely visible in its expanded state. And the applet will now only allow one item at a time to be expanded (could be made configurable quite easily, but currently it is not).

TimetableMate also got some fixes, including a fix for a freeze in the background JavaScript parser. The parser now works again, making the script items in the projects dock and the function combobox in script tabs show functions correctly again. Completion was also improved with nicely formatted documentation of the script API. It now shows all function overloads. There is now also a link shown in the completion box to open the documentation in the documentation dock (similar to KDevelop).

Sunday, 30 December 2012

PublicTransport 0.11 alpha 2

There is another alpha version for PublicTransport (0.11 alpha 2). It includes many build, crash and other fixes, some small new features (eg. snap scrolling at the top of departure items in the applet), a polished TimetableMate UI and one bigger change: Provider plugins are now hosted on openDesktop.org, you can browse them on kde-files.org or using the download dialog in the applet configuration.
This is great for a number of reasons: provider plugins can be updated independently of the data engine (ie. users can get plugin fixes very quickly); performance is better because the data engine only needs to track downloaded plugins instead of all available ones; users can add new plugins very easily (TimetableMate can do it); and social features like comments and ratings can be used for provider plugins.
This fixes the long-broken GHNS feature of the applet (libpublictransporthelper). I moved from newstuff.kde.org to an easy-to-create Synchrotron repository first (just add publictransport to synchrotron.git and add provider files to synchrotron-sources.git). This was a quick way to test it and it worked. But it lacks most GHNS features like upload, ratings or descriptions. Therefore I moved to openDesktop.org.

All provider plugins are now available to download, including some new GTFS providers. No provider plugins will be installed by default. Unused plugins from previous versions should be removed (eg. "make uninstall" or manually from /usr/share/kde4/apps/plasma_engine_publictransport/serviceProviders/ or a similar path).

The engine was updated to never delete connected data sources. Previously, for example, a "ServiceProvider <id>" source was deleted when the provider was uninstalled. Now the sources get updated accordingly instead. This makes it possible to install a new provider plugin and directly use it. The cache is now also properly cleaned of data for providers that are no longer installed.

All build problems (missing protobuf or pthread, not found "javascriptcompletiongeneric.h", etc.) should be fixed now. There were also some fixes for crashes that happened randomly in the data engine (bad synchronization when a job was aborted). In the runner these crashes happened much more often because of the many job aborts while typing. Getting route data for HAFAS plugins now works more reliably. The GTFS service/importer also saw some improvements and is now able to import more GTFS feeds.

Thanks for testing and a happy new year! :)

Monday, 3 December 2012

PublicTransport 0.11 alpha available

Time for an update about the progress made in the PublicTransport project!
I just tagged version 0.11 alpha in Git and updated the page on kde-look.org (with an installer script).
There are many changes and new features, in short:
Supported countries in Europe
(not shown: India, Japan, New Zealand, USA)
  • GTFS feeds are now supported along with GTFS-realtime
  • The Swiss public transport API is supported by a provider script
  • There is now a script API for provider scripts (eg. network requests can be started from scripts)
  • Stable data formats for HAFAS providers (with a shared script code base, removed most HTML layout dependency)
  • Scripts now run in their own threads using ThreadWeaver
  • Less network traffic, improved performance
  • More supported providers in more countries
  • Much improved TimetableMate (for creating and testing provider plugins)
  • A Plasma::Service for timetable data sources, can be used to request manual updates, additional data or earlier/later items
  • Another Plasma::Service to handle the GTFS database, eg. to import GTFS feeds
  • Popup maps for stop input fields (if supported by the provider)
  • Improved journey search and more details in journey view

Three new HAFAS providers have been added (since it is now very easy, see below): de_vbb, ie_eireann and no_dri (adding support for Norway and Ireland).

With the new GTFS providers the list of supported countries extends to (in alphabetical order):
Austria, Belgium, Denmark, France, Great Britain, Hungary, India, Ireland, Italy, Japan, New Zealand, Norway, Poland, Spain, Sweden, Switzerland, Ukraine, USA (some states).

Now some more details about what has changed, with quite a few screenshots:

Open Data - GTFS

GTFS (General Transit Feed Specification) is now supported. GTFS feeds can be downloaded and imported into an SQLite database using a new Plasma::Service.
GTFS providers can then be used offline; when online, realtime data can be added using GTFS-realtime. The database takes some disk space (around 50 - 500 MB depending on the area covered), but it works almost instantly.
Many new countries are now supported through GTFS: USA, Spain, France, Great Britain, Hungary, India, Japan, New Zealand, Ukraine, Poland. And there are many more GTFS feeds available that wait to be included by adding provider description files (*.pts) for them. TimetableMate can now also be used to create GTFS provider plugins.
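As a rough illustration of what such an import involves, the following Python sketch loads the stops.txt file of a GTFS feed into an SQLite table and queries it offline. It is only a sketch under assumptions: the real importer is a Plasma::Service written in C++, the table layout here is hypothetical, and the two sample stops are made up. Only the column names (stop_id, stop_name, stop_lat, stop_lon) come from the GTFS specification.

```python
import csv
import io
import sqlite3

# Made-up sample of a GTFS stops.txt file (column names are from the GTFS spec).
STOPS_TXT = """stop_id,stop_name,stop_lat,stop_lon
1001,Main Street,53.0830,8.8130
1002,Central Station,53.0837,8.8137
"""


def import_stops(connection, stops_txt):
    """Import the stops of a feed into a (hypothetical) 'stops' table."""
    connection.execute(
        "CREATE TABLE IF NOT EXISTS stops ("
        "stop_id TEXT PRIMARY KEY, stop_name TEXT, stop_lat REAL, stop_lon REAL)"
    )
    rows = [
        (r["stop_id"], r["stop_name"], float(r["stop_lat"]), float(r["stop_lon"]))
        for r in csv.DictReader(io.StringIO(stops_txt))
    ]
    connection.executemany("INSERT OR REPLACE INTO stops VALUES (?, ?, ?, ?)", rows)
    connection.commit()
    return len(rows)


connection = sqlite3.connect(":memory:")
imported = import_stops(connection, STOPS_TXT)

# Once imported, stop suggestions can be answered without any network access.
names = [
    row[0]
    for row in connection.execute(
        "SELECT stop_name FROM stops WHERE stop_name LIKE ? ORDER BY stop_name",
        ("%Station%",),
    )
]
```

A complete feed contains more files (routes, trips, stop times, calendars), which is where most of the disk space goes.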
GTFS feed import in progress
Provider information dialog showing a GTFS provider
with its current state

Open Data - Swiss public transport API

Besides the existing provider for Switzerland (ch_sbb, uses the undocumented HAFAS API), there is now an open timetable data source for Switzerland: http://transport.opendata.ch (thanks to Mario Fux for the link). It works like the other scripted providers, but it has a simple API and its documentation is publicly available. This of course makes it much easier to use such data. Hopefully more timetable data will be opened like this in the future.
The applet using transport.opendata.ch

New Provider Script API using QtScript

Kross is no longer used for script execution; QtScript is now used instead. That made it easy to create an API for the scripts and to add debugging features to TimetableMate.
The new script API has been created to provide the scripts with some more possibilities. Scripts can now start network requests themselves using the network object, store some data using the storage object, publish found timetable items using the result object or use the helper object for some more functions, mostly for parsing. The helper object also includes functions to easily parse HTML without the need to think about the common pitfalls. A provider object includes the properties of the ServiceProviderData class, ie. data from the *.pts file describing the provider (name, author, version, homepage, etc.). Instead of using strings, scripts can now use enumeration values, eg. for vehicle types or provider features.
Decoding of documents downloaded using the network object is also done by the scripts using helper.decode(). The documents are available as QByteArray otherwise and can be read using a data stream: DataStream objects can be created by scripts, which wrap QDataStream and provide functions like readInt16(). QDataStream only provides operator >>, which cannot be used from QtScript.
Other script languages can still be used through Kross from inside QtScript, but this is untested.
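The idea behind such a DataStream wrapper can be sketched in Python: a small reader object that keeps a position in a byte array and offers named read functions instead of operator >>. The class and method names below are hypothetical and are not the script API. The sketch assumes big-endian byte order, which is the default of QDataStream.

```python
import struct


class ByteReader:
    """Hypothetical reader with named functions, similar in spirit to readInt16()."""

    def __init__(self, data):
        self._data = data
        self._pos = 0

    def _read(self, fmt):
        size = struct.calcsize(fmt)
        if self._pos + size > len(self._data):
            raise EOFError("not enough bytes left")
        (value,) = struct.unpack_from(fmt, self._data, self._pos)
        self._pos += size  # advance past the value that was just read
        return value

    def read_int16(self):
        return self._read(">h")  # signed 16 bit, big-endian

    def read_uint32(self):
        return self._read(">I")  # unsigned 32 bit, big-endian


reader = ByteReader(bytes([0x00, 0x2A, 0xFF, 0xFE, 0x00, 0x00, 0x01, 0x00]))
first = reader.read_int16()    # bytes 00 2A
second = reader.read_int16()   # bytes FF FE
third = reader.read_uint32()   # bytes 00 00 01 00
```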

Improvements for HAFAS Providers

For providers using HAFAS software (eg. de_db, ch_sbb, at_oebb, etc.), there is now a new flexible base script. It can be included using include() and is able to read some HAFAS formats like XML for departures/arrivals/route data and a binary format for journeys. Unsurprisingly the XML documents are much shorter than the HTML documents that were used before. Parsing them is easier and more efficient, and it no longer depends on the HTML layout of the provider web site. It also removes the need to write new parsers for each HTML layout of each supported HAFAS provider, because the XML format is always the same. With less data to be downloaded the engine is now also much faster for HAFAS providers. Stop suggestions are retrieved in JSON format, which now gets parsed using the built-in JavaScript function JSON.parse() for best performance.

A "HAFAS-XML" departure document of size 0.9 KB (uncompressed 4.2 KB) replaces an HTML document with the same information in it, but of size 22.6 KB (uncompressed 150 KB). This means that 25 times less data needs to be downloaded :)
The "HAFAS-binary" format for journeys is only 2 KB, the HTML version is 69 KB (both compressed), this means over 34 times less data to be downloaded.
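The factors quoted above follow directly from the document sizes. A short Python check, using only the numbers given in the text:

```python
def reduction_factor(old_kb, new_kb):
    """How many times smaller the new document is."""
    return old_kb / new_kb


departures = reduction_factor(22.6, 0.9)    # compressed HTML vs. HAFAS-XML
journeys = reduction_factor(69, 2)          # compressed HTML vs. HAFAS-binary
uncompressed = reduction_factor(150, 4.2)   # uncompressed HTML vs. HAFAS-XML

print(round(departures, 1), journeys, round(uncompressed, 1))
```

So the compressed departure document is about 25 times smaller, the journey document 34.5 times smaller, and for the uncompressed departure documents the factor is even larger (about 36).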

Some providers also offer route data in an XML format, others in a Lynx text version or mobile/desktop HTML. Route data can now be downloaded later as additional timetable data using a new Plasma service for the departure data source. Unfortunately the URLs to the route data documents (traininfo.exe) are only available from HTML departure sources, therefore a mobile version of the HTML departure document gets parsed for the URLs and then cached for following additional data requests.
The applet got a new setting (in the "Advanced" tab), that controls when additional data should be requested. By default additional data gets requested when a departure gets unfolded for the first time. There is also the possibility to directly request all route data for all received departures. This makes sense when route data gets used in filters, ie. the "Via" or "Next Stop" filters. These filters cannot work correctly until the route data is available. Otherwise this setting should be left unchanged to avoid unnecessary network traffic and wasted CPU time.

The de_rmv script also uses the new HAFAS base script, but it uses another XML format (maybe only used by de_rmv). This XML format is somewhat longer but also includes route data. Actually the format used for de_rmv is the same that was already used in previous versions. But I ported the C++ code to QtScript (very easy with the qt.xml extension), so it no longer uses its own provider-type C++ class.

The it_cup2000 script is another special user of the HAFAS base script: it implements its own parser functions for HTML departures/arrivals/stop suggestions.
Back to life is us_septa, also a provider using HAFAS software. Departures/arrivals are still only available as HTML, but journeys are available in the binary format.

Removed Providers

EFA also uses an undocumented API like HAFAS. EFA scripts that parse HTML and no longer work are removed for now. A base script class for EFA providers could be written just like it's done with HAFAS providers (base_hafas.js, base_hafas_timetable.js, etc.). TimetableMate can be used for this task. This would make it easy to add support for some more providers (mostly in Germany).

The Slovak provider sk_imhd was removed because its HTML structure changed a lot; fr_gares and sk_atlas were also removed because they no longer work, and de_vvs is waiting for an EFA base script to be added again. If you miss one of these providers, you could use TimetableMate to fix the scripts, add yourself as author and send me the fixed script.

Improved Updates

The "Update Timetable" action of the applet works again. This is based on work in the engine and uses the "requestUpdate" operation of the timetable service.
The engine now stores two QDateTime values in timetable data sources: "nextAutomaticUpdate" and "minManualUpdateTime". Automatic updates are done by the engine at the given time. Before "minManualUpdateTime" has passed all (manual) update requests are rejected/blocked by the engine. The applet uses these values to disable the update action while the engine would reject the update and to show the next automatic update time to the user (in the tooltip of the bottom label).
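The rule for manual updates can be sketched as follows. This is a minimal Python illustration of the behaviour described above, not the applet's code. The function names and the tooltip text are hypothetical; only the two time values correspond to "nextAutomaticUpdate" and "minManualUpdateTime".

```python
from datetime import datetime, timedelta


def can_request_manual_update(now, min_manual_update_time):
    """Manual updates are rejected until minManualUpdateTime has passed."""
    return now >= min_manual_update_time


def update_action_state(now, next_automatic_update, min_manual_update_time):
    """State of the update action plus a (hypothetical) tooltip text."""
    return {
        "enabled": can_request_manual_update(now, min_manual_update_time),
        "tooltip": "Next automatic update: " + next_automatic_update.strftime("%H:%M"),
    }


now = datetime(2012, 12, 3, 12, 0, 0)
min_manual = now + timedelta(seconds=30)
next_automatic = now + timedelta(minutes=5)

state_now = update_action_state(now, next_automatic, min_manual)
state_later = update_action_state(now + timedelta(minutes=1), next_automatic, min_manual)
```

Here the action is disabled at 12:00:00 and enabled one minute later, while the tooltip shows 12:05 in both cases.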

The applet with disabled update action,
showing the next automatic update

Load Additional Timetable Data Later

If some timetable data is not available without more work (eg. downloading more documents) providers can now provide such data later as additional timetable data per departure/arrival. HAFAS providers use this for route data, which is not directly available in their departure XML-documents.
The applet by default requests additional data when a departure gets unfolded for the first time. It is also possible to never request additional data or to directly request additional data for all new departures.
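The three options can be thought of as a small policy decision. The following Python sketch uses hypothetical names for the options and events; it only illustrates the behaviour described above.

```python
from enum import Enum


class AdditionalDataPolicy(Enum):
    """Hypothetical names for the three options of the applet setting."""
    NEVER = "never"
    ON_FIRST_UNFOLD = "on first unfold"   # the default
    FOR_ALL_DEPARTURES = "for all new departures"


def should_request(policy, event, already_requested):
    """Decide whether to request additional data for one departure.

    event is "received" when a new departure arrives and "unfolded" when
    the user expands it (both event names are assumptions).
    """
    if already_requested or policy is AdditionalDataPolicy.NEVER:
        return False
    if policy is AdditionalDataPolicy.ON_FIRST_UNFOLD:
        return event == "unfolded"
    # FOR_ALL_DEPARTURES: request as soon as the departure is received.
    return event in ("received", "unfolded")
```

For example, with the default policy nothing is requested when a departure is received, only when it gets unfolded and no request was made for it before.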
The applet configuration showing additional data options

Marble Public Transport Stop Map

Marble is now used to show public transport stops in a map (if Marble is available at build time). Such a map is shown in a popup for the StopLineEdit of libpublictransporthelper. A stop can be clicked in the map to use its name in the StopLineEdit. Providers can now provide longitude/latitude values for each stop they read in getStopSuggestions(); these values are needed for the map to work. For providers that are able to get stops by geographical position (eg. most HAFAS providers), the stops of the currently shown region of the map are automatically loaded.
The stop coordinates of the providers are now also used by the applet instead of using the osm engine. This makes the "Show in Map" action work reliably. The runner also uses this for the "run action" of stop suggestion results and shows found stops in Marble.

Applet configuration dialog showing a popup map

Improved Journey Search

Searching for journeys also got some new features. Favorite/recent journey searches with alias names can be used and quickly be executed using the "Quick Journey Search" button. To get earlier or later journeys one click is enough. And for HAFAS providers there are now more details, every single intermediate stop can be shown (hidden by default, showing only connecting stops).
New quick journey search button
with favorite/recent journey searches
Favorite/recent journey searches
can be edited in a dialog

Earlier and later journeys can easily be requested
More details for journeys of HAFAS providers,
showing all intermediate stops

TimetableMate

Writing new provider scripts gets a lot easier with TimetableMate 0.3. It can now open multiple projects and debug scripts, and it got several docks (projects, variables, backtrace, breakpoints, console, output, web inspector, network monitor, test, documentation). There are multiple automatic tests, a project dashboard written in QML, a GTFS database viewer, etc. There is also a new icon for it, a modified KDevelop icon. TimetableMate now looks more like KDevelop.
The QScriptEngineDebugger class is not used, instead there is now a new script debugger to better integrate it into TimetableMate. It also allows other special handling for provider scripts.
The documentation shown inside TimetableMate gets generated at build time from some of the source documentation for the data engine.

TimetableMate with the new project dashboard
and the console/projects docks
TimetableMate while debugging,
 showing current variables and some produced output
The new testing features of TimetableMate


Versioning

All this will be available in the next version 0.11. If you wonder what happened to 0.10, it silently got a release tag in Git, but the provider scripts were mostly no longer working (changed HTML layouts, etc.). I already had new working scripts, but they depended on the new script API and porting them back was not easily possible. But 0.11 will be more resistant to such errors.
Previously I planned to release GTFS in version 0.12, but since GTFS already works quite well and testers are needed, there is no reason to delay it further. Especially because provider types can be disabled at build time now. A 0.12 build without GTFS would have been the same as a 0.11 build.


Test it!


It will still have some bugs, but nothing too big. Most provider plugins should work as expected. All features should also work when supported by the used provider. Please help to make 0.11 stable :)

Saturday, 29 October 2011

Plasma PublicTransport

Public Transport Data in a KDE Plasma Desktop

I'm Friedrich Pülz and this is my first blog ;) To introduce myself: I'm studying informatics in Bremen, Germany and I'm the author of the PublicTransport project among others (eg. KrossWordPuzzle game, Glucose plasma applet for diabetics).

The PublicTransport applet in action for a German stop

Motivation 
 
I started the PublicTransport project (a single Plasma applet at that time) because I was annoyed at using websites of public transport service providers. To get a list of journeys you have to open up a browser, navigate to the service provider's website, type in the origin and target stop names and finally you can see the results. The PublicTransport applet simplifies this: it eg. stores the name of your home stop and sits on your desktop or in your panel. You only have to type in the target stop name to view a journey list (or use a favorite journey search, see below). A departure or arrival board for your home stop is shown by default, as can be seen in the screenshot above.
This brings possibilities for interesting features: Alarms, filters, favorite journeys, use of GPS to find nearby stops, themable design or eg. to show stop positions in a map application like Marble. It frees online timetable data from the browser.


PublicTransport Project
 
The project consists not only of a single applet now, but also two more applets, two data engines, a runner, a tool to add support for new service providers (TimetableMate) and a helper library (shared between the applets, the runner and TimetableMate). The data engines are named publictransport (to get timetable data) and openstreetmap (used in conjunction with the geolocation data engine to get public transport stops near the user). The other two applets are Flights (only shows flight departures with status) and GraphicalTimetableLine (showing vehicles moving on a street, see below).

I'll now give an overview of the current state of the project.


PublicTransport Data Engine

The PublicTransport data engine provides timetable data from different service providers. It uses "accessors" to get timetable data from service providers.
Currently there is one main type of accessor, which downloads documents and parses them using a script (in JavaScript, Ruby or Python, using Kross). Another type parses XML files, but is only used by one accessor (de_rmv). A new type is currently being developed in a feature branch in the git repository: GTFS. This new type will be included in version 0.11 (which may become 1.0). It imports GTFS feeds into a local database (therefore it will work offline). Adding GTFS feeds of new service providers will be very easy.
There are currently 21 accessors available for Germany, Italy, Czech Republic, Switzerland, Austria, Belgium, Denmark, France, Poland, Slovakia, Sweden and the USA. Not all of them cover a whole country and there are multiple accessors for single countries. But at least for Germany, Switzerland, Austria and Poland it can be used for all public transport stops and train stations. Flight departures/arrivals are retrieved using flightstats.com for flights all over the world.
Writing new scripted accessors isn't too hard, I've even created a tool for that task (TimetableMate). Complete information about it is available in the documentation of the data engine (eg. at http://publictransport.horizon-host.com/doc/engine/0.10/page_accessor_infos.html). An accessor needs an XML file with information about the service provider, URLs to download departure documents from and a script to parse them. For parsing, specially named functions in the script get called with the downloaded timetable documents (mostly HTML).

The data engine has multiple data sources, eg. "Service Providers" to give information about installed service providers or "Locations" to give information about all supported countries. The more interesting data sources are "Departures ...", "Arrivals ..."/"Journeys ..." and "Stops ..." (for stop suggestions). These data sources need more information like a stop name to get departures for, so a complete source name to get departures from "Bremen Hbf" using the service provider "de_db" looks like this: "Departures de_db|stop=Bremen Hbf". The service provider can be omitted, the data engine then uses the default service provider for the user's current country.
The data engine tries to get as much information from the timetable document as possible like delays, delay reasons, news, routes, platforms, operators, vehicle types, ...
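The structure of such source names can be sketched in Python. The helper functions below are hypothetical and only model the example given above ("Departures de_db|stop=Bremen Hbf"). How exactly a source name looks when the provider is omitted is an assumption here.

```python
def build_source_name(kind, stop, provider=None):
    """Build a source name like "Departures de_db|stop=Bremen Hbf"."""
    # Assumption: without a provider only the kind precedes the first "|".
    prefix = kind if provider is None else f"{kind} {provider}"
    return f"{prefix}|stop={stop}"


def parse_source_name(source_name):
    """Split a source name into kind, provider ID and key=value parameters."""
    head, _, params = source_name.partition("|")
    kind, _, provider = head.partition(" ")
    parameters = dict(p.split("=", 1) for p in params.split("|") if "=" in p)
    return {"kind": kind, "provider": provider or None, "parameters": parameters}


name = build_source_name("Departures", "Bremen Hbf", provider="de_db")
parsed = parse_source_name(name)
default_provider = parse_source_name(build_source_name("Departures", "Bremen Hbf"))
```

In this sketch a missing provider ID is reported as None, which is where the engine would fall back to the default provider for the user's country.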

All service providers in the data engine are covered by unit tests, so that accessors can be updated quickly when a service provider decides to change the layout of its website. Error messages from parser scripts get logged (with the HTML code where parsing failed), which makes it easier to identify and fix problems (eg. ~/.kde/share/apps/plasma_engine_publictransport/accessors.log).


PublicTransport Helper Library

This library can be used by applets, runners or normal applications which use the PublicTransport data engine. It offers classes for filters, an enumeration for vehicle types and flexible widgets/dialogs to configure stop settings/filters.
Originally the code was developed inside the publictransport applet, but since there is now also a runner, I put that code into a separate library and made it more flexible. It now also gets used by TimetableMate.
The library has good documentation for almost everything and should help people that want to write another timetable applet or runner using the data engine (or maybe a wallpaper plugin which shows a bus stop with buses coming and going like they do in reality ;)).


PublicTransport Applet

The applet shows departure/arrival boards for configured stops. You can also use it to search for journeys. It has some advanced features like filters and alarms. Departures can be filtered by multiple constraints, eg. a filter can be created to only show buses that go via a given stop. Alarms use the same filter classes to filter out the departures to create alarms for.
If enough data is available, the applet can show delays, news about the departures and stops on the route.
To make it easy to distinguish departures in the list, they are grouped by direction automatically. These groups are visualized by background colors. Each group can be turned off to filter out its departures.
The appearance of the applet is very flexible, eg. its contents can be made very big (with big fonts/icons), which is useful if the applet is used like a big display panel.
A screenshot can be seen on top of this post. 
In the next version (0.10), journey searches that are used often can be set as a favorite journey search with a meaningful name. These journey searches can be executed with one click.
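The filtering and grouping described above can be illustrated with a small Python sketch. The departures are made-up sample data and the function names are hypothetical; the real filter classes live in the helper library.

```python
def passes_via_filter(departure, via_stop):
    """True if the route of the departure contains the given stop."""
    return via_stop in departure["route"]


def group_by_direction(departures):
    """Group departures by direction, keeping their order inside each group."""
    groups = {}
    for departure in departures:
        groups.setdefault(departure["direction"], []).append(departure)
    return groups


# Made-up sample departures for illustration only.
departures = [
    {"line": "6", "vehicle": "tram", "direction": "Airport",
     "route": ["Main Station", "Airport"]},
    {"line": "25", "vehicle": "bus", "direction": "University",
     "route": ["Main Station", "Market Square", "University"]},
    {"line": "6", "vehicle": "tram", "direction": "University",
     "route": ["Main Station", "University"]},
    {"line": "26", "vehicle": "bus", "direction": "Airport",
     "route": ["Market Square", "Airport"]},
]

# "Only show buses that go via a given stop"
buses_via_market = [
    d for d in departures
    if d["vehicle"] == "bus" and passes_via_filter(d, "Market Square")
]

groups = group_by_direction(departures)
```

Turning off a direction group then simply means hiding all departures stored under that key.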


GraphicalTimetableLine Applet

This applet shows departures as vehicle icons moving on a street with nice animations.


GraphicalTimetableLine applet in action for a German stop with a tooltip


PublicTransport Runner

The runner can show departures, arrivals, journeys and stop suggestions using a simple query syntax, eg. "Departures Bremen Hbf" to show departures from "Bremen Hbf". It automatically uses the default accessor for the user's country and can directly be used.
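A query like "Departures Bremen Hbf" consists of a keyword followed by a stop name. The following Python sketch shows one way to split such a query. The keyword table and the function are hypothetical and do not reflect the runner's actual parser; the German keyword is just an example of a custom keyword.

```python
# Hypothetical default keywords, mapped to the kind of data to request.
DEFAULT_KEYWORDS = {
    "departures": "Departures",
    "arrivals": "Arrivals",
    "journeys": "Journeys",
    "stops": "Stops",
}


def parse_query(query, keywords=DEFAULT_KEYWORDS):
    """Return (kind, stop name), or None if the query does not match."""
    keyword, _, rest = query.strip().partition(" ")
    kind = keywords.get(keyword.lower())
    stop = rest.strip()
    if kind is None or not stop:
        return None
    return kind, stop


result = parse_query("Departures Bremen Hbf")

# Custom keywords, eg. a German word for "departures"
custom = dict(DEFAULT_KEYWORDS, abfahrten="Departures")
german_result = parse_query("Abfahrten Bremen Hbf", custom)
```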

The runner in action with a custom German keyword for "departures"

TimetableMate

This is a little IDE that helps with adding support for new service providers. It offers syntax completion, syntax error checking, complete script checking with sample data, an embedded web viewer (using KWebKit), a GUI for all accessor settings (name, author, template URLs, changelog, ...) and installation of new accessors.

TimetableMate with syntax completion


Problems

The biggest problem is getting the timetable data. Mostly HTML documents need to get parsed, which of course isn't very nice. For some service providers there are better alternatives at least for stop suggestions (eg. JSON). For one service provider an XML source is used to get departures/arrivals (de_rmv). I talked with "Deutsche Bahn" (de_db) about that, but for now I only got a half-closed interface to get journeys (but with much less information than when using the HTML source).
This will be resolved partly with the coming GTFS accessor type, which can be used for many new service providers (with ~10 lines of XML for each accessor). But here in Germany for example, there is no publicly available GTFS feed.


Future
  • Better support for GHNS to download new accessors. This gets more important with the many new GTFS accessors.
  • Finish GTFS support
  • New parser for journey search strings


Resources