March 29, 2018

How does SoS compare with other workflow engines?

Over 200 workflow systems have been developed to date. Like other software tools, many workflow systems are actively evolving, with new features added from time to time. The goal of this blog post is to illustrate the features and limitations of SoS as a conventional workflow system by comparing it with some of the most popular workflow systems similar to SoS. It should be seen as a checklist of basic workflow features, in addition to the unique niche that SoS occupies in the realm of workflow systems, as explained in the next section and in other posts. Read more

December 15, 2017

SoS: a cure to pipelineitis

Because of the need to use libraries and tools in different languages and to execute them on different systems such as computer clusters, bioinformaticians write a lot of scripts in different languages and face many challenges in developing, running, managing, sharing, and reproducing bioinformatic data analyses. Notably, management of scripts: with an increasing number of scripts, some in multiple versions, some shared among projects, some written for and executed on remote systems, it can be difficult to share data analyses with others and to reproduce prior data analyses at a later time. Read more

December 10, 2017

What's the big deal about backing SoS Notebook with a workflow engine?

After I announced the release of SoS Notebook as a third-party multi-language kernel for Jupyter, I was repeatedly asked (e.g. on HackerNews, AzureNotebooks, and in reviews of our manuscript) the following question: Why did you not use an existing multi-language notebook (e.g. Apache Zeppelin or BeakerX) or contribute a multi-language feature to the core of Jupyter or JupyterLab? There were several technical (e.g. the architecture of the current Jupyter core is not suitable for multi-language support) and practical (e. Read more

November 29, 2017

SoS Notebook: one notebook, multiple languages

I started to use IPython, and then Jupyter, more than ten years ago, but despite all the nice features, there was always something missing, something that prevented me from using it as my main working environment. Notably, Jupyter lacks line-by-line execution of scripts: the basic execution units of Jupyter are cells, so there is no easy way to execute, tweak, and re-execute pieces of code for debugging purposes (StackOverflow Question, Jupyter Issue 1094). Read more

© Bo Peng, Ph.D. / MD Anderson Cancer Center All rights reserved