- Difficulty level: easy
- Time needed to learn: 15 minutes or less
- Key points:
  - Passing data across kernels allows you to use the best tool of each language
  - Magic `%get` gets variables from another kernel
  - Magic `%put` puts variables to another kernel
  - Magic `%with` executes the cell in another kernel with input and output variables
A SoS notebook can have multiple live kernels with SoS serving as the master kernel to all other kernels (called subkernels). As described in this tutorial, SoS can process input and output of subkernels without knowing what they actually do. However, if SoS knows what the kernels do via appropriate language modules, it provides much more powerful ways to communicate with the kernels, the most important of which is the exchange of variables among subkernels.
Before we get to the magics that exchange variables between kernels, it helps to understand that SoS does not transfer variables among kernels. Instead, it creates independent variables with the same names and similar types that are native to the destination language. For example, if you have the following two variables
a = 1
b = c(1, 2)
in R and execute the magic
%get a b --from R
in a SoS cell, SoS actually executes the following statements in the background to create variables `a` and `b` in Python:
a = 1
b = [1, 2]
As shown in the following figure, language modules try to choose the best method, sometimes in memory and sometimes via disk, to pass variables from one kernel to another, but all of that complexity is hidden from you. Variables in different kernels are independent, so changing the value of `a` or `b` in one kernel will not affect the variable of the same name in another kernel. Note also that `a` and `b` have different types in Python (an integer and a list) although they have the same `numeric` type in R (`a` is, technically speaking, an array of size 1). That is to say, SoS guarantees neither a one-to-one correspondence between data types nor lossless data exchange.
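The translation described above can be pictured with a small Python sketch. This is not how SoS is implemented: `from_r_numeric` is a hypothetical helper, and it assumes that an R numeric vector arrives as a Python list of numbers. It only mimics the rule that a length-1 vector becomes a scalar while a longer vector becomes a list, and that the result is an independent copy.

```python
def from_r_numeric(values):
    """Hypothetical stand-in for an R-to-Python conversion (not SoS code).

    R has no scalars: `a = 1` is a numeric vector of length 1. This sketch
    maps a length-1 vector to a Python scalar and anything longer to a list.
    """
    values = list(values)  # copy, so the result is independent of the source
    return values[0] if len(values) == 1 else values


r_a = [1]       # stands in for R's `a = 1`
r_b = [1, 2]    # stands in for R's `b = c(1, 2)`

a = from_r_numeric(r_a)
b = from_r_numeric(r_b)

# Same "numeric" type in R, but different types in Python.
print(type(a).__name__, type(b).__name__)

# The copies are independent: changing the source does not change `b`.
r_b.append(3)
print(b)
```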
The easiest way to get a variable from another kernel is to use the magic `%get`. It accepts one or more variable names, plus an option `--from` if you are not getting them from the master `SoS` kernel.
For example, with a variable `data` defined in SoS, you can `%get` the variable in an R kernel as follows:
The type of `data` in R is `numeric` because `data` is a numeric list in Python.
However, if the variable contains different types of data, for example integers and strings, it will be translated to a list in R.
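The difference between the two cases can be sketched in Python. The function `to_r_expr` below is hypothetical and is not part of SoS. It assumes that list elements are only numbers or strings, and it writes out the R expression that such a translation might produce: a vector `c(...)` when all elements are of the same kind, and a `list(...)` when they are mixed.

```python
def to_r_expr(values):
    """Hypothetical sketch: render a Python list as an R expression.

    Assumes every element is an int, a float, or a str. Elements of a
    single kind fit in an R vector `c(...)`; mixed elements need an R
    `list(...)`, which can hold heterogeneous items.
    """
    def item(v):
        # Quote strings for R; numbers are written as-is.
        if isinstance(v, str):
            return '"' + v.replace('"', '\\"') + '"'
        return repr(v)

    kinds = {"text" if isinstance(v, str) else "number" for v in values}
    wrapper = "c" if len(kinds) <= 1 else "list"
    return f"{wrapper}({', '.join(item(v) for v in values)})"


print(to_r_expr([1, 2, 3]))   # all numbers
print(to_r_expr([1, "a"]))    # mixed integer and string
```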
Similarly, you can get a `data.frame` `mtcars` from R in SoS, but the option `--from R` is needed to specify the source kernel.
The type of `mtcars` in SoS (Python) is, not surprisingly, a Pandas `DataFrame`.
You can also `%get` variables from one subkernel in another subkernel. For example, the following cell gets `mtcars` in a `Julia` kernel. As the warning message says, because Julia data frames do not yet support row labels, `mtcars` in Julia will not have row labels. If you really need such information for your analysis in Julia, you will have to transfer it separately.
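One way to carry row labels across is to turn them into a regular column before the transfer, since ordinary columns should survive the conversion. The sketch below uses plain Python lists and dicts rather than a real data frame. The helper `labels_to_column` and the column name `model` are made up for illustration.

```python
def labels_to_column(row_labels, columns, name="model"):
    """Hypothetical helper: store row labels as an ordinary column.

    `columns` maps column names to equal-length lists, a plain-Python
    stand-in for a data frame. The result is a new dict whose first
    column holds the former row labels.
    """
    if name in columns:
        raise ValueError(f"column {name!r} already exists")
    table = {name: list(row_labels)}
    table.update({key: list(vals) for key, vals in columns.items()})
    return table


# Three rows of mtcars, reduced to two columns for illustration.
labels = ["Mazda RX4", "Datsun 710", "Valiant"]
columns = {"mpg": [21.0, 22.8, 18.1], "cyl": [6, 4, 6]}

table = labels_to_column(labels, columns)
print(list(table))  # column names, with the label column first
```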
Magic `%put` is similar to `%get`, but it puts variables from the current kernel to another kernel. By default it puts variables to SoS, but it can put them to another subkernel with the option `--to`.
For example, the following cell puts the variable `ncars` to SoS:
It is important to note that although the `%put` magic is specified at the beginning of the cell (as required by SoS), it is actually executed after the rest of the cell has been executed.
`ncars` is available in SoS after the `%put` magic.
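The order of events can be illustrated with a short Python sketch. It is only a model of the behavior described here, not SoS code: `run_cell` is a hypothetical function, and two plain dicts stand in for the namespaces of the current kernel and of SoS.

```python
def run_cell(code, kernel_vars, put_names, target_vars):
    """Hypothetical model of a cell that starts with `%put`.

    `kernel_vars` and `target_vars` are plain dicts standing in for the
    namespaces of the current kernel and the destination kernel.
    """
    # 1. The cell body runs first, even though the magic is written first.
    exec(code, kernel_vars)
    # 2. Only afterwards are the named variables copied to the target.
    for name in put_names:
        target_vars[name] = kernel_vars[name]


subkernel = {"cars": ["Mazda RX4", "Datsun 710", "Valiant"]}
sos = {}

# `ncars` does not exist until the cell body has run, so the put must come last.
run_cell("ncars = len(cars)", subkernel, ["ncars"], sos)
print(sos["ncars"])
```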
Similarly, you can put variables to another kernel using the `--to` option:
and the variable `df` will be available in R.
Say that during a Python-based data analysis you need a bunch of random numbers, and either you do not have SciPy installed or you are more familiar with R. You can call R as follows
Here the `%with` magic is just a shortcut to
but with the `%with R` magic the cell behaves like a function call, without changing the kernel of the cell.
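The function-call-like behavior can be pictured as a round trip: input variables go to the other kernel, the cell runs there, and output variables come back. The Python sketch below models that idea under simplifying assumptions. `call_in_kernel` is a hypothetical function, a fresh dict stands in for the other kernel, and Python code stands in for the R statements.

```python
def call_in_kernel(code, caller_vars, inputs, outputs):
    """Hypothetical model of a `%with` round trip (not SoS code)."""
    # Send the input variables to the other kernel.
    other = {name: caller_vars[name] for name in inputs}
    # Run the cell in the other kernel's own namespace.
    exec(code, other)
    # Bring the requested output variables back to the caller.
    for name in outputs:
        caller_vars[name] = other[name]


caller = {"n": 5}
cell = "import random\narr = [random.random() for _ in range(n)]"

call_in_kernel(cell, caller, inputs=["n"], outputs=["arr"])
print(len(caller["arr"]))
```

Only the named outputs reach the caller. Anything else the cell defines stays in the other namespace, which is what makes the cell behave like a function with arguments and return values.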
The `%with` magic can also be used from a subkernel to call statements in SoS or in another subkernel. For example, the following cell calls the `head` function of `DataFrame` to get the first few rows of `mtcars`, and returns the result as a `data.frame` in R.
Compared to other multi-language approaches such as Python's `rpy2`, Julia's `PyCall`, or MATLAB's Python engine, it is important to note that all statements and data types in a SoS environment are native and therefore easier to work with, with the disadvantage that your analyses can only be executed in a SoS notebook (not as a standalone script).