Sunday, June 28, 2015

How to handle nested objects when processing a JSON stream

I am working on a program that needs to process a very large JSON file, so I would like to use a streaming, event-oriented reader (like jsonstreamingparser) to avoid loading the entire structure into memory at once. One thing I'm concerned about, though, is the object structure that seems to be required to make this work. For example, say I'm writing a program like Evite that sends out invitations to an activity, with a JSON structure like:

{  
  "title": "U2 Concert",  
  "location": "San Jose",  
  "attendees": [  
    {"email": "foo@bar.com"},  
    {"email": "baz@bar.com"}
  ],  
  "date": "July 4, 2015"  
}

What I would like is an "event" that fires when the stream encounters a new attendee and sends out an invitation email. But I can't do that, because at that point the stream has not yet reached the date of the event.
Of course, for this example it would be fine to just read everything into memory, but in my real dataset the objects in the position of "attendees" are complex, and there can be tens of thousands of them.
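
To make the ordering problem concrete, here is a minimal Python sketch. It does not use jsonstreamingparser: walk_events is a hypothetical stand-in that replays parser-style events in document order from an already-parsed value, and parent_fields_seen_per_attendee is an illustrative name. It only shows which top-level fields are known at the moment each attendee is read.

```python
import json

DOC = """{
  "title": "U2 Concert",
  "location": "San Jose",
  "attendees": [
    {"email": "foo@bar.com"},
    {"email": "baz@bar.com"}
  ],
  "date": "July 4, 2015"
}"""


def walk_events(value, key=None):
    """Yield (event, key, value) tuples in document order.

    Hypothetical stand-in for a real streaming parser: it walks an
    already-parsed value, so it illustrates event ordering only.
    """
    if isinstance(value, dict):
        yield ("start_object", key, None)
        for k, v in value.items():
            yield from walk_events(v, k)
        yield ("end_object", key, None)
    elif isinstance(value, list):
        yield ("start_array", key, None)
        for item in value:
            yield from walk_events(item, None)
        yield ("end_array", key, None)
    else:
        yield ("scalar", key, value)


def parent_fields_seen_per_attendee(doc_text):
    """Pair each attendee email with the top-level fields known at that point."""
    seen = {}    # top-level scalar fields read so far
    depth = 0    # nesting depth; 1 means "directly inside the root object"
    result = []
    for event, key, value in walk_events(json.loads(doc_text)):
        if event in ("start_object", "start_array"):
            depth += 1
        elif event in ("end_object", "end_array"):
            depth -= 1
        elif depth == 1:
            seen[key] = value
        elif key == "email":
            result.append((value, sorted(seen)))
    return result
```

Running parent_fields_seen_per_attendee(DOC) reports only "location" and "title" for both attendees: "date" arrives after the array has already been consumed.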

Another "solution" is to simply mandate that all the required "parent" attributes come first, but that restriction is exactly what I'm trying to find a way around.
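
For reference, the "parents first" rule could be enforced roughly as follows. This is only a sketch under assumptions: the event tuples are a simplified, hypothetical format (not what any particular parser emits), and REQUIRED and invitations are illustrative names.

```python
# Assumed set of parent fields an invitation needs.
REQUIRED = ("title", "location", "date")


def invitations(events):
    """Yield one invitation dict per attendee, as soon as the attendee is read.

    `events` is a hypothetical, simplified stream of
    ("field", name, value) and ("attendee", None, email) tuples.
    Raises ValueError if an attendee arrives before all REQUIRED fields.
    """
    parent = {}
    for kind, name, value in events:
        if kind == "field":
            parent[name] = value
        elif kind == "attendee":
            missing = [f for f in REQUIRED if f not in parent]
            if missing:
                raise ValueError(f"parent fields must come first, missing: {missing}")
            yield {"to": value, **parent}


# Parent attributes first: every invitation can be sent immediately.
parents_first = [
    ("field", "title", "U2 Concert"),
    ("field", "location", "San Jose"),
    ("field", "date", "July 4, 2015"),
    ("attendee", None, "foo@bar.com"),
    ("attendee", None, "baz@bar.com"),
]

# Order from the example document: "date" comes after the attendees.
document_order = [
    ("field", "title", "U2 Concert"),
    ("field", "location", "San Jose"),
    ("attendee", None, "foo@bar.com"),
    ("attendee", None, "baz@bar.com"),
    ("field", "date", "July 4, 2015"),
]
```

With parents_first the generator yields two complete invitations while holding only the parent fields in memory; with document_order it raises ValueError on the first attendee, which is the constraint I would like to avoid imposing on the input.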

Any ideas?
