Guest blog post by Vincent Granville

Theo: One idea is that clients purchase a number of transactions in advance, before using the paid service, and top up their balance (in dollars) regularly. A transaction is a call to the API.

The service is accessed via an HTTP call that looks like this:

http://www.datashaping.com/AnalyticsAPI?clientID=xxx&dataSource=yyy&service=zzz&parameters=abc
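As a rough illustration, a client could assemble such a call in Python as sketched below. The four field names come from the URL above; the values (`client42`, the data file URL, `scoring`) are made-up placeholders, and `build_request_url` is a hypothetical helper, not part of any published client library.

```python
from urllib.parse import urlencode

# Endpoint taken from the example URL above.
BASE_URL = "http://www.datashaping.com/AnalyticsAPI"


def build_request_url(client_id, data_source, service, parameters):
    """Return the full API URL with each query field properly escaped."""
    query = urlencode({
        "clientID": client_id,
        "dataSource": data_source,  # URL of the source data on the client's server
        "service": service,
        "parameters": parameters,
    })
    return f"{BASE_URL}?{query}"


# All values below are placeholders for illustration only.
url = build_request_url(
    client_id="client42",
    data_source="http://client.example.com/data.csv",
    service="scoring",
    parameters="training",
)
print(url)
```

Escaping matters here because the `dataSource` field is itself a URL, so its `:` and `/` characters must be percent-encoded to survive inside the query string.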

When the request is executed:

  • First, the script checks whether the client has enough credits (dollars).
  • If so, it fetches the data from the client's web server; the URL of the source data is yyy.
  • It then checks whether the source data is valid and whether the client's server is reachable.
  • It executes the service zzz, typically a predictive scoring algorithm.
  • The parameters field tells whether you are training your predictor (data = training set) or using it for actual predictive scoring (data outside the training set).
  • It processes the data very fast (a few seconds for 1MM observations in the training step).
  • When done, it emails the client the location (on the datashaping server) of the results. The location can be specified in the API call as an additional field, with a mechanism in place to prevent file collisions.
  • Finally, it updates the client's budget.
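
The steps above could be sketched roughly as follows. This is only an outline under stated assumptions, not the actual server script: `Client`, `DataSourceError`, `handle_request` and the four functions passed in (`fetch`, `run_service`, `notify`, `price`) are hypothetical names, and the choice not to charge a request whose data could not be fetched is an assumption as well.

```python
from dataclasses import dataclass


@dataclass
class Client:
    client_id: str
    credits: float  # prepaid balance, in dollars


class DataSourceError(Exception):
    """Source data is invalid or the client's server is unreachable."""


def handle_request(client, data_source, service, parameters,
                   fetch, run_service, notify, price):
    """Process one API call; the collaborators are passed in as functions."""
    cost = price(service, parameters)
    # Step 1: refuse the call if the prepaid balance is too low.
    if client.credits < cost:
        return "insufficient credits"
    # Steps 2-3: fetch the data from the client's server and validate it.
    try:
        data = fetch(data_source)
    except DataSourceError as err:
        # Assumption: a request that fails here is not charged.
        return f"data error: {err}"
    # Steps 4-6: run the requested service in training or scoring mode.
    results_location = run_service(service, parameters, data)
    # Step 7: tell the client where the results are.
    notify(client.client_id, results_location)
    # Step 8: update the client's budget.
    client.credits -= cost
    return "ok"


# Usage with stand-in functions (all values are made up).
sent = []
client = Client("client42", credits=10.0)
status = handle_request(
    client, "http://client.example.com/data.csv", "scoring", "training",
    fetch=lambda url: [[1.0, 2.0], [3.0, 4.0]],
    run_service=lambda service, parameters, data: "/results/client42/run1.csv",
    notify=lambda client_id, location: sent.append((client_id, location)),
    price=lambda service, parameters: 2.5,
)
print(status, client.credits)
```

Passing the fetch, scoring, notification and pricing steps in as functions keeps the control flow visible and makes it easy to try out with stand-ins, as in the usage lines.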

Note that all of this can be performed without any human interaction. The scored data can be retrieved by a web robot and then integrated into the client's database (again, automatically). Training a predictor would be charged much more than scoring one observation outside the training set. Scoring one observation is a transaction, and could be charged as little as $0.0025.
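
To put the per-observation price in perspective, here is a small worked calculation. It uses only the $0.0025 figure quoted above and, as an example, a batch of 1MM observations; the function name `batch_cost` is made up, and the price of a training run is left out because no figure is given for it.

```python
from decimal import Decimal

# Lowest per-observation scoring price mentioned in the text, in dollars.
PRICE_PER_SCORE = Decimal("0.0025")


def batch_cost(n_observations, unit_price=PRICE_PER_SCORE):
    """Cost in dollars of scoring n_observations at a flat unit price."""
    return n_observations * unit_price


# Example: a batch of one million scored observations.
cost = batch_cost(1_000_000)
print(f"${cost:,.2f}")
```

At that rate, a 1MM batch comes to $2,500. `Decimal` is used instead of floats so that very small unit prices add up exactly.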

This architecture is designed for daily or hourly processing, but it could be used in real time if the parameters field is not set to "training". However, when designing the architecture, my idea was to process large batches of transactions, maybe 1MM at a time.