Data Science London Hackathon

On the weekend of October 5th, I participated in the Data Science London Hackathon for Smart Cities. We were given access to a number of datasets of city-based data from London, including:

  • Car Parking Counts
  • Oyster Journeys
  • Incidents of Antisocial Behaviour

A couple of guys from work and I formed a team (TeamLYST) and decided to take a closer look at the antisocial behaviour dataset to see if we could make something interesting.

The data listed events that happened on a given day on a given street, covering about a month. The events were lovingly categorised as:

  • Dog Fouling
  • Graffiti
  • AntiSocials (public urination, vomit, etc)

From this we decided to build a predictive application that would generate a set of likely events for a given day of the week (Monday, Tuesday, etc).

The application was split into 3 parts:

  1. Pre-processing the data into a useful format, adding in default values, etc
  2. Creating a generative predictive model from this data
  3. Visualising the data

There were three of us on the team, so I picked the visualisation. I built it using Python and PyGame, drawing a PNG of London that was generated from OpenStreetMap. Event locations were converted to map coordinates, and the map could be panned and zoomed with the events staying where they were supposed to be. The visualiser let you flip through different days and request newly generated events.
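As a rough illustration of keeping events pinned to the map while panning and zooming, here is a minimal sketch. It is not the hackathon code: the `MapView` class, its field names and the bounding-box approach are all assumptions made for the example, and it treats latitude as linear across the image, which is only a reasonable approximation over a city-sized area.

```python
from dataclasses import dataclass


@dataclass
class MapView:
    """Hypothetical view state: geographic bounds of the map image plus pan/zoom."""
    lon_min: float
    lon_max: float
    lat_min: float
    lat_max: float
    img_w: int            # width of the map image in pixels
    img_h: int            # height of the map image in pixels
    zoom: float = 1.0     # scale factor applied to the image
    offset_x: float = 0.0  # pan offset in screen pixels
    offset_y: float = 0.0

    def geo_to_image(self, lon, lat):
        # Linear interpolation inside the bounding box (assumed, ignores projection).
        x = (lon - self.lon_min) / (self.lon_max - self.lon_min) * self.img_w
        # Image y grows downwards, latitude grows upwards, so flip the axis.
        y = (self.lat_max - lat) / (self.lat_max - self.lat_min) * self.img_h
        return x, y

    def geo_to_screen(self, lon, lat):
        # Apply the same zoom and pan to events as to the map image itself,
        # so events stay attached to their streets.
        x, y = self.geo_to_image(lon, lat)
        return x * self.zoom + self.offset_x, y * self.zoom + self.offset_y
```

The point of the design is that events are stored in geographic coordinates and only converted to screen coordinates at draw time, so changing `zoom` or the offsets never touches the event data.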

The generative model was trained by looking at each Monday, Tuesday, etc to work out a count of each event type per street, which was then normalised against the total number of events for that day of the week. This gave a likelihood for each event type on each street for each day of the week. Assuming that all events are equally likely to occur (a big assumption), we can sample a normal distribution and apply this to our likelihood map to generate an event. We repeat this as many times as the average number of events for that day, and we get a pseudo-typical event set.
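The counting and sampling steps can be sketched roughly as follows. This is a reconstruction under assumptions, not our original code: the record layout `(date, street, event_type)` and the function names `train` and `generate` are made up for the example. It also swaps in weighted sampling via `random.choices` (which draws uniform random numbers under the hood) rather than the normal-distribution sampling described above, since that is the simplest standard way to draw from a table of likelihoods.

```python
import random
from collections import Counter, defaultdict
from datetime import date


def train(records):
    """Build a per-weekday model from (date, street, event_type) records.

    Returns {weekday: (likelihood, avg_events)}, where likelihood maps
    (street, event_type) to its share of all events seen on that weekday.
    """
    counts = defaultdict(Counter)
    days_seen = defaultdict(set)
    for day, street, event_type in records:
        wd = day.weekday()  # 0 = Monday ... 6 = Sunday
        counts[wd][(street, event_type)] += 1
        days_seen[wd].add(day)

    model = {}
    for wd, counter in counts.items():
        total = sum(counter.values())
        likelihood = {key: n / total for key, n in counter.items()}
        # Average over dates that appear in the data; dates with no
        # recorded events are not counted (an assumption of this sketch).
        model[wd] = (likelihood, total / len(days_seen[wd]))
    return model


def generate(model, weekday, rng=random):
    """Sample a pseudo-typical list of (street, event_type) for a weekday."""
    if weekday not in model:
        return []
    likelihood, avg_events = model[weekday]
    outcomes = list(likelihood)
    weights = [likelihood[o] for o in outcomes]
    return rng.choices(outcomes, weights=weights, k=round(avg_events))
```

Calling `generate(model, 0)` would then give a fresh set of plausible Monday events each time, which is the sort of thing the visualiser could request when asked for newly generated events.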

The final product worked as intended and, with more accurate data, could be extended into a nice predictive application to help local law enforcement plan their responses and distribute resources.

We didn’t win the hackathon, but it was a fun experience. We put up a video of our work too.
