Use Pyodide to run Python in browsers - rendering medical DICOM files

2021.10.02@PyconTW, video

Grimmer

Outline

 Aim - Migrate the data-parsing part of my previous JavaScript DICOM Viewer (Web/Chrome extension) to Pyodide  

  • Motivation: the viewer used a third-party JavaScript DICOM parser library that no longer seems to be maintained. The other JavaScript DICOM parser library might be too heavy to use.
  • Why Pyodide?

Comparison of visualization + parsing solutions

Pure JS (JavaScript)
  • Pros: strongly interactive; highly customizable UI; easy distribution
  • Cons: front-end skills needed; may lack some JS data libraries; a JS parsing library may be implemented differently from its Python version

JS + Py Server
  • Pros: strongly interactive; highly customizable UI; extensibility (any Python package)
  • Cons: front-end skills needed; heavier: more effort on data movement, API and distribution

Pure Py - Jupyter
  • Pros: fast prototyping & experiments
  • Cons: none of the pros of Pure JS

JS + Pyodide
  • Pros: same as Pure JS; * scientific Python stack supported & any pure Python package; offloads CPU load from the Python server
  • Cons: front-end skills needed; need to learn Pyodide; Pyodide is under development; download size (30~150MB); not every Python package / function is supported; loading time (~3s); Python in WebAssembly is 2~5 times slower, use NumPy to compensate

We can see why JS + Pyodide may benefit

Pyodide can & can't

Support

  • JavaScript can access Python objects, methods, and functions
  • Python can access JavaScript objects, methods, and functions
  • any pure-Python wheel package
  • Pyodide Python packages (e.g. numpy.js/numpy.data), either pure Python or including C/C++/Cython extensions
  • NumPy, Pandas, Matplotlib, SciPy, scikit-learn, etc.

Not support yet

  • native Python networking: urllib.request / http.client. Instead, use JavaScript XMLHttpRequest/fetch from Python

Need effort

  • your own Python files have to be imported one by one
  • interactive debugger

Try Pyodide in a REPL directly in your browser (no installation needed).

An example that plots Taiwan COVID-19 vaccination data via serverless JupyterLite (Pyodide based)

It uses plotly, pandas, fileio, http fetch

How Pyodide works

## https://pyodide.org/en/stable/usage/quickstart.html#accessing-javascript-scope-from-python

import js

div = js.document.createElement("div")
div.innerHTML = "<h1>This element was created from Python</h1>"
js.document.body.prepend(div)

Python can even manipulate the HTML DOM via JavaScript

How to use Pyodide (v0.18.0, v0.18.1)

  1. download Pyodide main js from URL or *NPM
 <!-- <script src="pyodide/pyodide.js"></script> -->
 <!-- <script src="https://cdn.jsdelivr.net/pyodide/dev/full/pyodide.js"></script> -->
<script src="https://cdn.jsdelivr.net/pyodide/v0.18.1/full/pyodide.js"></script>

2. load Pyodide main WebAssembly build:

/* this downloads the following:
   package.json, pyodide.asm.data, pyodide.asm,  distutils.js, distutils.data */
// use globalThis.pyodide or a local variable 
const pyodide = await loadPyodide({ indexURL: baseURL + "pyodide/" });

3. get your Python scripts, either embed them in JavaScript string,

const pythonScript = `
    print("hello world")
    import micropip 
    await micropip.install('pydicom') # or self hosted wheel url
    import sys
    sys.version
`

or put them in external .py file(s), then fetch

// execute `npm install pyodide` first, then
const pyodide = await import("pyodide/pyodide.js");
// if the origin (e.g. "https://localhost/") is omitted, the file is fetched from the same origin
const url = 'python/hello_world.py'
const pythonCode = await (await fetch(url)).text();

4. (optional) load Pyodide python packages imported in your Python code

// it will download pillow.js pillow.data if `from PIL import Image` in your python code
await pyodide.loadPackagesFromImports(pythonCode);

5. run your Python code; you will see hello world in your browser console, pydicom is downloaded from PyPI, and the return value comes from the last line

const pythonVer = await pyodide.runPythonAsync(pythonCode);
console.log(pythonVer) // 3.9.5 (default, Sep 16 2021, 15:37:13)...

Pyodide works like Python's built-in interactive interpreter or Jupyter: once it is loaded, a newly added script can access the state left in memory by earlier scripts

# test1.py
num = 1
num
# test2.py
print(num) # 1
num += 1 
num # 2
/* in js file */
const num = await pyodide.runPythonAsync(pythonCodeTest1);
console.log(num); // 1
/* in js file */
const num = await pyodide.runPythonAsync(pythonCodeTest2);
console.log(num); // 2
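
The persistence shown above can be imitated in plain CPython. This is a minimal sketch, assuming each run executes against one shared globals dictionary (as `pyodide.globals` does); the helper `run_python` is hypothetical and only mimics the "value of the last expression" behaviour of `runPythonAsync`.

```python
import ast


def run_python(code: str, namespace: dict):
    """Execute `code` in `namespace`; return the value of a trailing expression, if any."""
    tree = ast.parse(code)
    last_value = None
    if tree.body and isinstance(tree.body[-1], ast.Expr):
        # Split off the final expression so its value can be returned.
        last_expr = ast.Expression(tree.body.pop().value)
        exec(compile(tree, "<run>", "exec"), namespace)
        last_value = eval(compile(last_expr, "<run>", "eval"), namespace)
    else:
        exec(compile(tree, "<run>", "exec"), namespace)
    return last_value


shared_globals: dict = {}
test1 = "num = 1\nnum"
test2 = "num += 1\nnum"

first = run_python(test1, shared_globals)   # defines num
second = run_python(test2, shared_globals)  # sees num left by the first run

custom_namespace: dict = {}
run_python("y = 4", custom_namespace)       # isolated from shared_globals
```

Because both runs share `shared_globals`, the second script can read and update `num`, while `y` only exists in the custom namespace.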

Accessing Python from JavaScript, ref

  • Python object types
    • immutable:
      • int, float, str, bool, None (-> undefined) -> value copied into JS data
      • tuple, bytes -> PyProxy, which forwards requests to the target Python object
    • mutable: list, dict, set, bytearray, function, your own class, etc. -> PyProxy
  • access ways
    • get object/value from pyodide.runPythonAsync last line
    • call a PyProxy Python function, get object/value from a function's return. Then read/write it
    • directly access object (read/write)
      • global namespace: pyodide.globals.get('num') & pyodide.globals.set('num', 2)
      • custom namespace
const my_namespace = pyodide.globals.get("dict")();
pyodide.runPython(`y = 4`, my_namespace);
console.log(my_namespace.get("y")); // ==> 4

Accessing JavaScript from Python

  • JavaScript Data types
    • primitive: number, string, boolean, undefined, null -> memory copied to Python immutable object (null/undefined are None, number->int/float)
    • object: built-in objects, ArrayBuffers, ArrayBuffer View (Int8Array) and your own class, etc ->  JsProxy
    • function ->  JsProxy
  • access ways
    • call a JsProxy JavaScript function, get object from a function's return. Then read/write it
    • direct access object:
      • global: import js
      • module scope:
# const my_js_module = {num: 3} // in JS
# pyodide.registerJsModule("my_js_module", my_js_module);

# in Python: read
from my_js_module import num
print(num) 

# write
import my_js_module
my_js_module.num = 10 

If the accessed data is neither a JsProxy nor a PyProxy, reading/writing it is straightforward, the same as usual in that language, thanks to Pyodide's implicit conversion

JsProxy & PyProxy

in Python: use JsProxy

  • supported operations, e.g.
    • read/write: subscript x[y] (array/object) & obj dot notation is supported
    • proxy.new(...) = new X(...) 
  • deep conversion (copy): to_py. e.g. JS array [1,2,3] to Python list [1,2,3]
  • *pyodide.to_js
  • pyodide.create_proxy & destroy()

in JavaScript: use PyProxy

  • supported operations, e.g.
    • read/write: dot notation is supported. Use proxy.get/set on list/dict
    • deep conversion (copy): toJs
  • use destroy() to avoid memory leak
  • *pyodide.toPy

 

print(obj) & console.log(obj) will auto trigger to_py()/toJs()

round trip: PythonObj -> PyProxy in JavaScript -> same PythonObj

const pyProxyObj = pyodide.runPython(`
    import sys
    sys.version
    class TestObject:
        num = 10
    def test_object_type(test_obj):
        print(type(test_obj)) # <class 'TestObject'>
        print(test_obj.num) # 10
        if test_obj is obj:
            print("same object")
    obj = TestObject()    
    obj
`);
console.log(pyProxyObj) // Proxy
const testFun = pyodide.globals.get("test_object_type")
testFun(pyProxyObj)
  const obj = {
      "num": 20
  }
  pyodide.runPython(`
      def echo_obj(obj):
          print(type(obj)) # <class 'pyodide.JsProxy'>
          return obj    
  `);
  const echoFun = pyodide.globals.get("echo_obj")
  const obj2 = echoFun(obj);
  console.log(obj2) // {num: 20}
  if (obj2 == obj) {
      console.log("same object2")
  }

round trip: JsObj -> JsProxy in Python -> same JsObj

Why PyProxy needs destroy(), ref

PyProxy needs the destroy method because even if all references to the PyProxy are removed, the PyProxy can hang around for a very long time because of the way the browser garbage collector works

 // in JavaScript
const test_fun = pyodide.globals.get('test_fun')
// count + 1 for the memory 
// in WebAssembly Python heap
const list = test_fun()

// so the total reference count for the 
// Python list in WASM memory is 1, 
// even when list is no longer used;
// something has to decrease it by 1 

// Either 
// 1. explicitly call list.destroy()
// 2. Pyodide registers destroy() in a FinalizationRegistry, 
// which runs when the browser finally recycles the proxy 
list.destroy()



## in python, 
def test_fun():
  # count + 1, allocate memory 
  # in WebAssembly heap
  list_ = [1,2,3]   
  # count -1 when leaving this fun
  return list_
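
The bookkeeping above can be imitated in plain CPython. This is only a toy model: `ToyProxy` and `Payload` are hypothetical stand-ins (not Pyodide classes), and a `weakref` is used to observe when the Python object is actually freed.

```python
import weakref


class Payload:
    """Stand-in for a Python object living on the WebAssembly heap."""


class ToyProxy:
    """Hypothetical stand-in for a PyProxy: it owns one reference to its target."""

    def __init__(self, target):
        self._target = target

    def destroy(self):
        # Drop the reference explicitly instead of waiting for a garbage collector.
        self._target = None


def test_fun():
    payload = Payload()        # count + 1 inside the function
    return ToyProxy(payload)   # the proxy keeps the object alive after return


proxy = test_fun()
watcher = weakref.ref(proxy._target)   # observes the payload without owning it

alive_before = watcher() is not None   # the proxy still holds a reference
proxy.destroy()
alive_after = watcher() is not None    # CPython frees the payload once the count hits 0
```

As long as the proxy exists and `destroy()` has not been called, the payload cannot be freed, which is the situation a forgotten PyProxy leaves behind on the JavaScript side.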

JavaScript ArrayBuffer & Python Buffer 

What is DICOM 

Digital Imaging and Communications in Medicine, includes

  • file format (p10 image, video, DICOM-ECG...)
  • network protocol (picture archiving and communication system (PACS))
  • store information: study, series, patient info., orientation, DICOM RT (radiation therapy)...

Need to handle below DICOM tags about image:

  • transferSyntax: uncompressed, jpeg baseline, jpeg lossless ...
  • modality: CT, MR, CR (x-ray), US...
  • photometric: MONOCHROME1, RGB, YBR_FULL, PALETTE...
  • transforms / LUT: VOI (e.g. window center + window width (level)), Modality, Palette ...
  • bits_allocated, bits_stored, high_bit, pixel_representation
  • planar. 0: RGBRGBRGB..., 1:RRR...GGG...BBB...
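
Two of the tags above translate directly into NumPy operations. The following is a minimal sketch, not the viewer's actual code: the helper names `pixel_dtype` and `to_rows_cols_rgb` are hypothetical, byte-order handling is left out, and only 8-bit RGB is covered for the planar case.

```python
import numpy as np


def pixel_dtype(bits_allocated: int, pixel_representation: int) -> np.dtype:
    """Map bits_allocated (8 or 16) and pixel_representation (0 unsigned, 1 signed) to a dtype."""
    kind = "i" if pixel_representation == 1 else "u"
    return np.dtype(f"{kind}{bits_allocated // 8}")


def to_rows_cols_rgb(raw: bytes, rows: int, cols: int, planar: int) -> np.ndarray:
    """Return a (rows, cols, 3) uint8 array from 8-bit RGB pixel bytes."""
    flat = np.frombuffer(raw, dtype=np.uint8)
    if planar == 0:                       # RGBRGBRGB...
        return flat.reshape(rows, cols, 3)
    # planar == 1: RRR...GGG...BBB... -> three planes, moved to the last axis
    return flat.reshape(3, rows, cols).transpose(1, 2, 0)


# A 1x2 image: first pixel (1, 2, 3), second pixel (4, 5, 6)
interleaved = bytes([1, 2, 3, 4, 5, 6])
planes = bytes([1, 4, 2, 5, 3, 6])
image_a = to_rows_cols_rgb(interleaved, rows=1, cols=2, planar=0)
image_b = to_rows_cols_rgb(planes, rows=1, cols=2, planar=1)
```

Both layouts end up as the same `(rows, cols, 3)` array, so later steps do not need to care which planar configuration the file used.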

Features

  • View online DICOM files by clicking DICOM urls
  • View offline DICOM by dragging files onto Chrome, or use built-in file browser to select files.
  • In terminal, use https://www.npmjs.com/package/cli-open-dicom-with-chrome to open DICOM files via this extension.
  • Shortcut (ctrl+u/cmd+u) to open extension viewer page. Or click extension icon.
  • Support adjustable window center mode.
  • Support multi-frame, RGB & JPEG DICOM files
  • Support different plane views mode
  • Show basic DICOM information
  • Web & Chrome extension

Adjust window center/width via mouse move, 20~30+ fps

3 plane views, use slider to switch sections
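
A minimal sketch of how three plane views can be cut from a stack of frames, assuming the frames are stacked into one `(slices, rows, cols)` ndarray. The function `plane_views` is hypothetical, and patient orientation and pixel spacing, which a real viewer has to take into account, are ignored here.

```python
import numpy as np


def plane_views(volume: np.ndarray, axial_i: int, coronal_j: int, sagittal_k: int):
    """Slice a (slices, rows, cols) volume along each axis."""
    axial = volume[axial_i, :, :]        # one original frame
    coronal = volume[:, coronal_j, :]    # a fixed row across all frames
    sagittal = volume[:, :, sagittal_k]  # a fixed column across all frames
    return axial, coronal, sagittal


# Toy volume: 4 frames of 3x5 pixels, as if built with np.stack(frames)
volume = np.arange(4 * 3 * 5).reshape(4, 3, 5)
axial, coronal, sagittal = plane_views(volume, axial_i=2, coronal_j=1, sagittal_k=0)
```

Each slider would then only change one index, and basic slicing returns views, so no pixel data is copied when switching sections.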

The implementation 

  • Types: TypeScript (JavaScript superset) in React & Python type annotations
  • DICOM parser: use well-documented lib, Pydicom 
  • Flow:
    • In TypeScript, load Pyodide and dicom_parser.py to get class PyodideDicom constructor function
    • Read local files / fetch online files
    • In TypeScript, call PyodideDicom(buffer, bufferList, jpegDecoder) and store the returned PyProxy object; __init__ does the following:
      • parse & store basic DICOM info. and pixel_data
      • use NumPy to do calculations and store the results in an RGBA 1D ndarray
    • In TypeScript, access dicomObj attributes to show DICOM info. and draw on Canvas. Reuse this object for several actions.
Use Python class & inject JS function

In TypeScript

const jpegDecoder = {
  // decoders: baseline, jpeg2000, jpegls and this one (lossless)
  lossless: (bytes: PyProxyBuffer) => {
    const buffer = bytes.getBuffer() // PyBufferData
    const decoder = new jpegLossless.lossless.Decoder();
    const data = buffer.data; // Uint8Array
    const decoded = decoder.decode(data, data.byteOffset, data.byteLength);
    buffer.release()
    return decoded.buffer // ArrayBuffer
  },
}
const dicomObj = PyodideDicom(buffer, bufferList, jpegDecoder)

In Python

from dataclasses import dataclass
from typing import Any, Optional

import numpy as np
import pyodide

@dataclass
class PyodideDicom:
    jpeg_decoder: Any = None # JsProxy
    final_rgba_1d_ndarray: Optional[np.ndarray] = None

    def __init__(
        self,
        buffer: Any = None, # JsProxy of a JS ArrayBuffer
        buffer_list: Any = None,
        jpeg_decoder: Any = None,
    ):
        self.jpeg_decoder = jpeg_decoder
        # buffer.to_py() results in a memoryview; ds is the DICOM object
        ds = self.get_pydicom_dataset_from_js_buffer(buffer.to_py())
        ## some pydicom parsing follows, after which the function below is called

    def decompress_compressed_data(self, dicom_pixel_data: bytes): # DICOM file's pixel_data
        jsobj = pyodide.create_proxy(dicom_pixel_data)
        uncompressed_jsproxy = self.jpeg_decoder.lossless( # JsProxy of an ArrayBuffer
            jsobj,
        )
        jsobj.destroy()
        uncompressed = uncompressed_jsproxy.to_py() # memoryview
        ## the code that uses DICOM tags to detect uint16/int16/int8/uint8 is omitted
        image: np.ndarray = np.frombuffer(uncompressed, dtype=np.uint16)

def render_frame_to_rgba_1d(): # total: 0.009 ~ 0.09s
    # steps:
    # 1. (optional) decompress JPEG & apply_modality_lut (uncompressed data: already done) ~ 0.05s
    # 2. get ndarray max/min
    # 3. invert colors for MONOCHROME1
    # 4. normalize for 1) color bit depth to 8 bits 2) window center mode (possibly np.clip)
    # 5. convert to rgba_1d_image_array, cases:
    #    1. rgb_image1d: reshape to rgb_image2d first (~0.03s), then use case 2.
    #       the np.insert approach is 2~3 times slower
    #    2. rgb_image2d: similar to case 3, only np.dstack((image2d, alpha)) differs
    #    3. grey_image2d #### 0.002s
    #    4. grey_image1d, which uses the case 3 function
    ...

# case 3 
def flatten_grey_image_to_rgba_1d_image_array(self, image: np.ndarray):
    alpha = np.full(image.shape, 255)
    stacked = np.dstack((image, image, image, alpha))
    image = stacked.flatten()
    image = image.astype("uint8")
    return image 
    
# old slow way 
def flatten_grey_image2d_to_rgba_1d_image_array_iterate_ndarray_way():
    # 2d -> 1d -> 1d * 4 (each array value is copied to R + G + B + A)
    for i_row in range(0, height):
        for j_col in range(0, width):
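
Steps 2 to 4 of the render outline above can be sketched as follows. This is an illustration under stated assumptions, not the viewer's code: `normalize_to_uint8` is a hypothetical helper, it uses a simplified linear window (low = center - width/2, high = center + width/2) rather than the exact formula in the DICOM standard, and it inverts after scaling, which gives the same picture as inverting first.

```python
import numpy as np


def normalize_to_uint8(image, photometric="MONOCHROME2", window_center=None, window_width=None):
    """Scale a grey image to 0..255, optionally through a window center/width."""
    pixels = image.astype(np.float64)
    if window_center is not None and window_width is not None:
        low = window_center - window_width / 2
        high = window_center + window_width / 2
    else:
        low, high = pixels.min(), pixels.max()   # step 2: use the full data range
    span = max(high - low, 1e-9)                 # avoid division by zero on flat images
    # step 4: np.clip keeps values outside the window at pure black / pure white
    scaled = np.clip((pixels - low) / span, 0.0, 1.0) * 255.0
    if photometric == "MONOCHROME1":             # step 3: smallest value is shown as white
        scaled = 255.0 - scaled
    return scaled.astype(np.uint8)


ct_like = np.array([[-1000, 0], [40, 3000]], dtype=np.int16)
windowed = normalize_to_uint8(ct_like, window_center=40, window_width=400)
inverted = normalize_to_uint8(ct_like, "MONOCHROME1", window_center=40, window_width=400)
```

Everything is done with whole-array operations, so adjusting the window on mouse move only costs a few NumPy calls per frame.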

  

Performance

Efficient (vectorized) NumPy usage is much faster than iterating over a NumPy array element by element

A speed test result on OT-MONO2-8-hip.dcm on https://barre.dev/medical/samples/.

convert "grey_image2d" to "rgba_1d_image" numpy.ndarray

  • numpy + efficient usage (np.dstack) in Pyodide (0.02s) >
  • numpy.ndarray + manual iteration calculation in local python (0.65s) >>
  • numpy.ndarray + manual iteration calculation in Pyodide. (4s)
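
The two approaches compared above can be reproduced on a small random image. This is a sketch: the loop body is an assumption about what the manual iteration did (copy each grey value into R, G, B and set A to 255), and the timings it prints depend on the machine and on whether it runs in CPython or Pyodide.

```python
import time

import numpy as np


def grey_to_rgba_vectorized(image: np.ndarray) -> np.ndarray:
    """np.dstack approach: stack grey, grey, grey, alpha and flatten."""
    alpha = np.full(image.shape, 255, dtype=np.uint8)
    return np.dstack((image, image, image, alpha)).reshape(-1)


def grey_to_rgba_loop(image: np.ndarray) -> np.ndarray:
    """Manual iteration: visit every pixel and write its 4 output bytes."""
    height, width = image.shape
    out = np.empty(height * width * 4, dtype=np.uint8)
    for i_row in range(height):
        for j_col in range(width):
            offset = (i_row * width + j_col) * 4
            out[offset:offset + 3] = image[i_row, j_col]   # R, G, B
            out[offset + 3] = 255                          # A
    return out


grey = np.random.default_rng(0).integers(0, 256, size=(64, 64), dtype=np.uint8)

start = time.perf_counter()
fast = grey_to_rgba_vectorized(grey)
fast_seconds = time.perf_counter() - start

start = time.perf_counter()
slow = grey_to_rgba_loop(grey)
slow_seconds = time.perf_counter() - start

print(f"vectorized: {fast_seconds:.5f}s, loop: {slow_seconds:.5f}s")
```

Both functions return the same flat RGBA array; only the time spent in the Python interpreter differs, which is the part that WebAssembly slows down most.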

 

After instantiating, use Python object in TypeScript 

/* get PyProxy obj of a PyodideDicom Python instance */
const image = PyodideDicom(buffer, bufferList, decompressJPEG)

/* directly access Python object's attributes */
setModality(image.modality)


/* get ndarray as canvas suitable data */ 
// image.final_rgba_1d_ndarray <- will leak: pyodide/pyodide#1853. Fixed in main branch.
const ndarray = image.get_rgba_1d_ndarray() 
const pixelBuffer = (ndarray as PyProxyBuffer).getBuffer("u8clamped"); // renamed: `buffer` is already the input passed to PyodideDicom
(ndarray as PyProxyBuffer).destroy();
const uncompressedData = pixelBuffer.data as Uint8ClampedArray

// draw on canvas
const imgData = new ImageData(uncompressedData, image.width, image.height);
const ctx = c.getContext("2d");
ctx.putImageData(imgData, 0, 0);
/* adjust window center/width & ask Python to update its ndarray */

// case1
const ndarray = image.render_frame_to_rgba_1d(newWindowCenter, newWindowWidth)

// case2
const ax_ndarray = image.render_axial_view.callKwargs({
  normalize_window_center: newWindowCenter, normalize_window_width: newWindowWidth
});

callKwargs passes the object's properties as Python keyword arguments

Another PyProxy case: besides x.get(i), for-of also works if the Python object is iterable

const decompressJPEG = {
  lossless: (bytes: any) => {	
    console.log("first byte:", bytes[0], bytes.get(0)); // undefined, 255
    /* A PyProxy of Python "bytes" is accessible via get() & for-of in JavaScript,
       so the commented-out line below works too, although the function expects an ArrayBuffer parameter.
       It iterates over each byte (implicitly copying each one to a number), which is slow,
       so use getBuffer() instead */
    // const decoded = decoder.decode(bytes);

    const buffer = bytes.getBuffer()
    const decoder = new jpegLossless.lossless.Decoder();
    const data = buffer.data;
    const decoded = decoder.decode(data, data.byteOffset, data.byteLength);
    buffer.release()
    return decoded.buffer
  },
}

Development note

  • Pyodide
    • Creating a Pyodide package (either pure Python or with C/C++/Cython extensions): the Pyodide author suggests adding only packages with extensions, since pure Python packages can be loaded from a URL
  • Pydicom
    • Some DICOM files are missing tags and fail to open
    • Opening certain DICOM files sometimes takes longer than with the JS parser.
  • Embedded Pydicom React Viewer
    • A new version of the Chrome extension has been published, bundled with the Pyodide core (20MB) + NumPy (10MB). Original: 176MB. 
    • To improve the experience while waiting for Pyodide to load, d4c-queue is used to enqueue Pyodide loading & file dragging.
    • todo: use comlink, a web worker lib, to speed things up. Even if the Python part only takes a few milliseconds, it may still affect the FPS of UI event updates. The FPS is OK for mouse move + Python + canvas re-render, but it drops noticeably when the Chrome console is open (since it uses the UI main thread too)

Thank you!

slido Q & A

hackmd

Summary

  • Pyodide is fun and might be useful. Evaluate whether it fits your requirements.
  • Pyodide welcomes contributions. I've submitted small PRs and been involved in discussions. A lot of things can be improved, e.g. TypeScript typing, web worker, debugger, Python module, loading time, OpenCV, etc. 
  • Embedded Pydicom React Viewer welcomes contributions too.  
    • Disclaimer: this is not for clinical use.