While browsing Stack Overflow I stumbled upon this question.
The issue seemed simple but interesting, and since Google had no answers I thought it would be fun to give it a shot. The first iteration of my solution involved reading in strings of indefinite length, but it quickly turned into an overly complex state machine, so I stepped back and rewrote it to take input one character at a time. You can find it on GitHub, under JSONCharInputReader.
JSONCharInputReader was designed to read in a JSON array, not an object.
In other words, this will work:
[1, 3, 4, {"var": "val"}, ["array_item", "array_item"], ...
but the following will not:
{"key": ["array", 1, 2], "key2": "value", ...
To parse a JSON stream, create a JSONCharInputReader object and pass in your own implementation of the JSONChunkProcessor interface. JSONChunkProcessor defines one function:
interface JSONChunkProcessor {
    public function process($jsonChunk);
}
Clients implementing this can expect $jsonChunk to be valid JSON (as long as the JSON data being read in is itself valid). If the array above were read in, the $jsonChunks passed to the processor would decode as follows:
// Decoding 1
int(1)
// Decoding 3
int(3)
// Decoding 4
int(4)
// Decoding {"var": "val"}
object(stdClass)#3 (1) {
["var"]=>
string(3) "val"
}
// Decoding ["array_item", "array_item"]
array(2) {
[0]=>
string(10) "array_item"
[1]=>
string(10) "array_item"
}
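A processor that produces output like the dump above could look roughly like this. It is only a minimal sketch: the interface is repeated so the snippet runs on its own, DumpingProcessor is a hypothetical name, and process() is called by hand here instead of by the reader, since the reader's own constructor and input methods are not shown in this post.

```php
<?php
// Interface as described above, repeated so this sketch is self-contained.
interface JSONChunkProcessor
{
    public function process($jsonChunk);
}

// Hypothetical processor: decodes each chunk, remembers it, and dumps it.
class DumpingProcessor implements JSONChunkProcessor
{
    public $decoded = array();

    public function process($jsonChunk)
    {
        // Each chunk is expected to be a complete, valid JSON value.
        $value = json_decode($jsonChunk);
        $this->decoded[] = $value;
        var_dump($value);
    }
}

// Simulate what the reader would hand over for part of the array above.
$processor = new DumpingProcessor();
$chunks = array('1', '{"var": "val"}', '["array_item", "array_item"]');
foreach ($chunks as $chunk) {
    $processor->process($chunk);
}
```

Keeping the decoding inside the processor means the reader only has to find chunk boundaries, and the client decides whether it wants objects, associative arrays, or the raw string.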
See example.php for a sample implementation of the reader, which can read JSON from the terminal by executing it with:
cat | php example.php
One notable limitation is that the $jsonChunks passed to JSONChunkProcessor are always top-level elements of the incoming JSON array. In other words, a large nested object or array is only processed once the reader has received all of its data.
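To see where this limitation comes from, here is a rough sketch of one way to split a JSON array into top-level chunks one character at a time. It is not the code from JSONCharInputReader: TopLevelSplitter and feed() are hypothetical names, and the sketch assumes well-formed input that starts with the opening bracket of the outer array.

```php
<?php
// Hypothetical illustration, not the library's implementation.
class TopLevelSplitter
{
    public $chunks = array();  // completed top-level elements, as JSON strings
    private $depth = 0;        // nesting depth; 1 = directly inside the outer array
    private $inString = false; // currently inside a JSON string literal
    private $escaped = false;  // previous character was a backslash in a string
    private $buffer = '';

    public function feed($char)
    {
        if ($this->inString) {
            // Brackets and commas inside strings are plain data.
            $this->buffer .= $char;
            if ($this->escaped) {
                $this->escaped = false;
            } elseif ($char === '\\') {
                $this->escaped = true;
            } elseif ($char === '"') {
                $this->inString = false;
            }
            return;
        }
        if ($char === '"') {
            $this->inString = true;
            $this->buffer .= $char;
            return;
        }
        if ($char === '[' || $char === '{') {
            $this->depth++;
            if ($this->depth > 1) {
                $this->buffer .= $char; // nested opener belongs to the element
            }
            return;
        }
        if ($char === ']' || $char === '}') {
            $this->depth--;
            if ($this->depth === 0) {
                $this->flush();         // outer array closed
            } else {
                $this->buffer .= $char;
            }
            return;
        }
        if ($char === ',' && $this->depth === 1) {
            $this->flush();             // separator between top-level elements
            return;
        }
        if ($this->depth >= 1) {
            $this->buffer .= $char;
        }
    }

    private function flush()
    {
        $chunk = trim($this->buffer);
        $this->buffer = '';
        if ($chunk !== '') {
            $this->chunks[] = $chunk;
        }
    }
}

$splitter = new TopLevelSplitter();
foreach (str_split('[1, {"a": [2,') as $c) {
    $splitter->feed($c);
}
$partial = $splitter->chunks; // the nested object is still incomplete here
foreach (str_split(' 3]}, "x,y"]') as $c) {
    $splitter->feed($c);
}
```

After the first half of the input only the element 1 has been emitted; the nested object appears in $chunks only after its closing brace and the following comma arrive, which is the behaviour described above.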