Ascama & Innovation

Progress on DocWrite


The design goal was to create an interface for an existing speech recognition engine that connects easily to the web service now running as DocWrite. DocWrite accepts speech from an iPhone and lets a professional on the go manage dictation in a secure way. At the same time, the service needs to allow aggressive scaling of the recognition engines.


My first step was to become knowledgeable about speech recognition. Once I had implemented the basics and could get a voice file recognized into a text file, it became important to set up the architecture for scaling and for coupling to existing and future services.

I chose the service queue approach, with a state machine serving as the memory of each transaction. Because the implementation was running on my local virtual machines while the data resides on a cloud across the ocean, I focused on spawning services in parallel so that latency didn’t have an impact on performance.
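To illustrate the idea of a service queue with a state machine as the memory of each transaction, here is a minimal Python sketch. It is not the DocWrite code: the state names, the `Job` class, `run_jobs`, and the `fake_recognize` placeholder (which stands in for a real recognizer engine) are all assumptions made for the example.

```python
import queue
import threading
from enum import Enum


class State(Enum):
    # Hypothetical states; the real service may track different steps.
    QUEUED = "queued"
    FETCHING = "fetching"
    RECOGNIZING = "recognizing"
    DONE = "done"
    FAILED = "failed"


# Allowed transitions. Anything not listed here is rejected.
TRANSITIONS = {
    State.QUEUED: {State.FETCHING, State.FAILED},
    State.FETCHING: {State.RECOGNIZING, State.FAILED},
    State.RECOGNIZING: {State.DONE, State.FAILED},
    State.DONE: set(),
    State.FAILED: set(),
}


class Job:
    """One dictation transaction; its state and history are its memory."""

    def __init__(self, job_id, audio):
        self.job_id = job_id
        self.audio = audio
        self.text = None
        self.state = State.QUEUED
        self.history = [State.QUEUED]

    def advance(self, new_state):
        if new_state not in TRANSITIONS[self.state]:
            raise ValueError(f"illegal transition {self.state} -> {new_state}")
        self.state = new_state
        self.history.append(new_state)


def fake_recognize(audio):
    # Placeholder for a call into a real speech recognition engine.
    return audio.upper()


def worker(jobs):
    # Each worker pulls jobs until it receives the None sentinel.
    while True:
        job = jobs.get()
        if job is None:
            return
        try:
            job.advance(State.FETCHING)      # e.g. download audio from the cloud
            job.advance(State.RECOGNIZING)   # hand the audio to the engine
            job.text = fake_recognize(job.audio)
            job.advance(State.DONE)
        except Exception:
            # FAILED is reachable from every non-terminal state.
            job.advance(State.FAILED)


def run_jobs(audio_items, num_workers=4):
    jobs = queue.Queue()
    all_jobs = [Job(i, audio) for i, audio in enumerate(audio_items)]
    threads = [threading.Thread(target=worker, args=(jobs,)) for _ in range(num_workers)]
    for t in threads:
        t.start()
    for job in all_jobs:
        jobs.put(job)
    for _ in threads:
        jobs.put(None)  # one sentinel per worker so every thread stops
    for t in threads:
        t.join()
    return all_jobs
```

Because several workers pull from the same queue, a slow fetch for one job does not hold up the others, which is the point of spawning services in parallel when the data sits far away.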


At this moment the local system is fully capable of recognizing voice, training a profile, and managing everything else that is needed. It scales aggressively with very low CPU overhead: less than 1% in standby and a few percent under full load, leaving the rest of the capacity for data conversion. So from that point of view I’m happy, as it seems able to scale to the maximum number of recognizer engines needed. The power of the implementation is in its ability to dynamically increase the number of parallel processes.
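One simple way to decide how many parallel processes to run is to derive the count from the current backlog. The sketch below is only an illustration of that idea: the function name `desired_workers`, its parameters, and the default limits are hypothetical and not taken from the DocWrite implementation.

```python
import math


def desired_workers(backlog, jobs_per_worker=2, min_workers=1, max_workers=16):
    """Return how many recognizer processes to run for a given backlog.

    backlog         -- number of jobs currently waiting in the queue
    jobs_per_worker -- how many waiting jobs one worker is expected to absorb
    min_workers     -- floor, so the service stays responsive when idle
    max_workers     -- ceiling, e.g. the number of available engines
    """
    needed = math.ceil(backlog / jobs_per_worker)
    return max(min_workers, min(max_workers, needed))
```

A supervisor could call this periodically and start or stop workers until the running count matches the returned value.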

The next step is moving the local VMware Fusion implementation to the VM instances running on Amazon’s cloud, working out the limitations of the recognizer engines, and working on future add-on services. However, as soon as the Amazon implementation is stable, the service will go live at

I’ll keep you posted.
