In the web app, consecutive sentences by the same speaker are combined. In the JSON, you have to join them yourself and start a new block each time the speaker switches. It would be better if the JSON response were grouped the same way as in the web app.
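Until then, the joining can be done on the client side. Here is a rough sketch of one way to do it. The JSON layout is an assumption on my part: I'm pretending the response is a list of sentence objects with hypothetical `speaker` and `text` fields, so the key names will need adjusting to whatever the real response uses.

```python
from itertools import groupby


def merge_by_speaker(sentences):
    """Join consecutive sentences from the same speaker into one turn.

    `sentences` is assumed to be a list of dicts with hypothetical
    "speaker" and "text" keys; the real JSON may use different names.
    """
    merged = []
    # groupby only groups *adjacent* items with the same key, so a speaker
    # who talks again later gets a new turn instead of being folded into
    # their earlier one.
    for speaker, group in groupby(sentences, key=lambda s: s["speaker"]):
        text = " ".join(s["text"] for s in group)
        merged.append({"speaker": speaker, "text": text})
    return merged


# Made-up sample data, only to show the shape of the input.
sentences = [
    {"speaker": "A", "text": "Hello."},
    {"speaker": "A", "text": "How are you?"},
    {"speaker": "B", "text": "Fine, thanks."},
    {"speaker": "A", "text": "Good."},
]

for turn in merge_by_speaker(sentences):
    print(f'{turn["speaker"]}: {turn["text"]}')
```

This keeps the original order and only merges neighbours, which is what the web app appears to do.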