Project author: PyNIPT

Project description:
Python NeuroImage Pipeline Tool
Language: Python
Repository: git://github.com/PyNIPT/pynipt.git
Created: 2019-02-16T21:48:24Z
Project community: https://github.com/PyNIPT/pynipt

License: GNU General Public License v3.0

PyNIPT (Python NeuroImage Pipeline Tool)

Version: 0.2

Description:

  • PyNIPT is a pipeline framework for neuroimaging data analysis that offers convenient yet powerful data-processing management features in the Jupyter Notebook environment. The module is designed to take input from a BIDS dataset and organize the derivatives into per-step directories instead of modifying filenames with prefixes or suffixes. It therefore preserves the original filename throughout processing, while the derivatives of each pipeline node are collected into a single directory.
  • The key features of this module are:

    1. Executing command-line tools or Python scripts without specifying input file paths. Sets of files are selected by choosing a node block and applying a regex (regular expression) pattern to the filename.
    2. Uninterrupted data processing in a single Jupyter notebook session. Commands are executed through a background scheduler, so the notebook is not blocked during processing.
    3. A simple code-based access point to any derived intermediate dataset, for use as input to a processing, analysis, or visualization node.
    4. A bottom-up style of pipeline development API. The easy-to-use debugging tools also maximize development convenience.
  • Dependency:

    • pandas >= 1.0.0
    • tqdm >= 4.40.0
    • psutil >= 5.5.0
    • paralexe >= 0.1.0
    • shleeh >= 0.0.6
  • Compatibility:

  • ChangeLog:

    • v0.2.1 (5/24/2020) - critical bug patch
    • v0.2.0 (5/24/2020) - user interface for debugging

Installation

  • from PyPI

    1. $ pip install pynipt
  • from the GitHub repository (nightly build); if you have any issues with the PyPI version, please try this instead

    1. $ pip install git+https://github.com/pynipt/pynipt

Example Project Data Structure

```
Project_Root/
├── JupyterNotes/
│   └── fMRI_Data_Preprocessing.ipynb
├── Data/
│   ├── dataset_description.json
│   ├── README
│   ├── sub-01/
│   │   ├── anat/
│   │   │   ├── sub-01_T2w.json
│   │   │   └── sub-01_T2w.nii.gz
│   │   ├── fmap/
│   │   │   ├── sub-01_fieldmap.json
│   │   │   ├── sub-01_fieldmap.nii.gz
│   │   │   └── sub-01_magnitude.nii.gz
│   │   └── func/
│   │       ├── sub-01_task-rest_bold.json
│   │       ├── sub-01_task-rest_bold.nii.gz
│   │       ├── sub-01_task-active_bold.json
│   │       └── sub-01_task-active_bold.nii.gz
│   └── sub-02/
│       ├── anat/
│       │   ├── sub-02_T2w.json
│       │   └── sub-02_T2w.nii.gz
│       ├── fmap/
│       │   ├── sub-02_fieldmap.json
│       │   ├── sub-02_fieldmap.nii.gz
│       │   └── sub-02_magnitude.nii.gz
│       └── func/
│           ├── sub-02_task-rest_bold.json
│           ├── sub-02_task-rest_bold.nii.gz
│           ├── sub-02_task-active_bold.json
│           └── sub-02_task-active_bold.nii.gz
├── Mask/
│   ├── 02A_BrainMasks-func/
│   │   ├── sub-01/
│   │   │   ├── sub-01_task-rest_bold.nii.gz
│   │   │   └── sub-01_task-active_bold_mask.nii.gz
│   │   └── sub-02/
│   │       ├── sub-02_task-rest_bold.nii.gz
│   │       └── sub-02_task-active_bold_mask.nii.gz
│   └── 02B_BrainMasks-anat/
│       ├── sub-01/
│       │   ├── sub-01_T2w.nii.gz
│       │   └── sub-01_T2w_mask.nii.gz
│       └── sub-02/
│           ├── sub-02_T2w.nii.gz
│           └── sub-02_T2w_mask.nii.gz
├── Processing/
│   └── MyPipeline/
│       ├── 01A_ProcessingStep1A-func/
│       │   ├── sub-01/
│       │   │   ├── sub-01_task-rest_bold.nii.gz
│       │   │   └── sub-01_task-active_bold.nii.gz
│       │   └── sub-02/
│       │       ├── sub-02_task-rest_bold.nii.gz
│       │       └── sub-02_task-active_bold.nii.gz
│       └── 01B_ProcessingStep1B-func/
│           ├── sub-01/
│           │   ├── sub-01_task-rest_bold.nii.gz
│           │   └── sub-01_task-active_bold.nii.gz
│           └── sub-02/
│               ├── sub-02_task-rest_bold.nii.gz
│               └── sub-02_task-active_bold.nii.gz
├── Results/
│   └── MyPipeline/
│       └── 030_2ndLevelStatistic-func/
│           ├── TTest.nii.gz
│           └── TTest_report.html
├── Temp/
├── Logs/
│   ├── DEBUG.log
│   ├── STDERR.log
│   └── STDOUT.log
└── Templates/
    └── BrainTemplate.nii.gz
```

A project folder is composed of six data components, as below:

  • Data: the naive BIDS dataset (this is the only folder required when you start)
  • Mask: stores outputs of single-subject image segmentation (such as brain masks), which may require manual refinement.
  • Processing: stores intermediate files generated by this module; these can be used as input for later processing nodes.
  • Results: stores report files that do not preserve the original data structure, such as group-level analyses.
  • Temp: stores intermediate files that can be disposed of without worry (i.e. not important to keep for further processing).
  • Logs: keeps log files. All logging messages, including debugging, standard output, and error messages from sub-processes, are recorded here.

    Optional data components can be used as below (up to you):

  • JupyterNotes: stores Jupyter notebooks: source code, documentation, and visualization outputs.
  • Templates: stores anatomical brain image templates, reference images, group-level brain masks, and labeled brain atlases.
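The six required components plus the two optional ones can be scaffolded in a few lines. The helper below is hypothetical (it is not part of the PyNIPT API, which creates these folders itself); it only illustrates the layout described above.

```python
from pathlib import Path
import tempfile

# Hypothetical helper, NOT part of PyNIPT: create the six required
# project components (and the two optional ones) under a project root.
COMPONENTS = ['Data', 'Mask', 'Processing', 'Results', 'Temp', 'Logs']
OPTIONAL = ['JupyterNotes', 'Templates']

def scaffold_project(root, with_optional=True):
    root = Path(root)
    for name in COMPONENTS + (OPTIONAL if with_optional else []):
        (root / name).mkdir(parents=True, exist_ok=True)
    return sorted(p.name for p in root.iterdir() if p.is_dir())

print(scaffold_project(tempfile.mkdtemp()))
# → ['Data', 'JupyterNotes', 'Logs', 'Mask', 'Processing', 'Results', 'Temp', 'Templates']
```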

Getting started

  • Start a pipeline from scratch

    ```python
    >>> import pynipt as pn
    >>> pipe = pn.Pipeline()
    ** Dataset summary

    Path of Dataset: /absolute/path/to/
    Name of Dataset:
    Selected DataClass: Data

    Subject(s): ['sub-01', 'sub-02']
    Datatype(s): ['anat', 'fmap', 'func']

    List of installed pipeline packages:

    >>> pipe.set_scratch_package('MyPipeline')
    The scratch package [MyPipeline] is initiated.
    ```

  • Execute the Linux shell command 'mycommand' for all files in Datatype 'func' and write the output files to Processing/01A_ProcessingStep1A-func.

    ```python
    >>> itb = pipe.get_builder(n_threads=1)  # in case mycommand takes huge computing resources
    >>> itb.init_step(title='ProcessingStep1A', suffix='func',
    ...               idx=1, subcode='A', mode='processing', type='cmd')
    >>> itb.set_input(label='input', input_path='func')
    >>> itb.set_var(label='param', value=10)
    >>> itb.set_output(label='output')
    >>> itb.set_cmd('mycommand -i *[input] -o *[output] -p *[param]')  # '-p' for the param flag is assumed; the original had a duplicated '-o'
    >>> itb.set_output_checker('output')
    >>> itb.run()
    ```
  • Execute the Linux shell command 'getmask' for the first file in StepCode '01A' to generate a brain mask and write the output to Mask/02A_BrainMasks-func; additionally, copy the original data to the output folder.

    ```python
    >>> itb = pipe.get_builder(n_threads=4)  # multi-threading
    >>> itb.init_step(title='BrainMasks', suffix='func',
    ...               idx=2, subcode='A', mode='masking', type='cmd')
    >>> itb.set_input(label='input', input_path='01A', idx=0,
    ...               filter_dict=filter_dict)
    >>> itb.set_output(label='mask', suffix='_mask')
    >>> itb.set_output(label='copy')
    >>> itb.set_cmd('getmask *[input] *[mask]')
    >>> itb.set_cmd('cp *[input] *[copy]')
    >>> itb.set_output_checker('mask')
    >>> itb.run()
    ```
  • Execute the Python function 'myfunction' for all files in StepCode '01A' and write the output files to Processing/01B_ProcessingStep1B-func.

    ```python
    def myfunction(input, output, param, stdout=None, stderr=None):
        # import modules here
        import sys
        import numpy as np
        import nibabel as nib
        if stdout is None:  # for handling output/error messages
            stdout = sys.stdout
            stderr = sys.stderr
        try:
            # put your code here
            stdout.write(f'Running MyFunction for input: {input}\n')
            img = nib.load(input)
            img_data = np.asarray(img.dataobj)
            result_data = img_data * param
            stdout.write(f'Multiply image by {param}\n')
            nii = nib.Nifti1Image(result_data, affine=img.affine, header=img.header)
            stdout.write(f'Save to {output}..\n')
            nii.to_filename(output)
            stdout.write('Done\n')
            # until here
        except Exception:  # for handling errors
            stderr.write('[ERROR] Failed.\n')
            import traceback
            traceback.print_exception(*sys.exc_info(), file=stderr)
            return 1
        return 0

    itb = pipe.get_builder()
    itb.init_step(title='ProcessingStep1B', suffix='func',
                  idx=1, subcode='B', mode='processing', type='python')
    itb.set_input(label='input', input_path='01A')  # the data from '01A' will be assigned to the 'input' argument of 'myfunction'
    itb.set_var(label='param', value=10)  # 10 will be assigned to the 'param' argument of 'myfunction'
    itb.set_output(label='output')  # no modification of the filename, so the output filename will be the same as the input
    itb.set_func(myfunction)
    itb.set_output_checker('output')
    itb.run()
    ```

  • Simple example of second-level statistics (t-test), 'rest' vs. 'active'

    ```python
    def myttestfunc(group_a, group_b, output, stdout=None, stderr=None):
        import sys
        if stdout is None:
            stdout = sys.stdout
        if stderr is None:
            stderr = sys.stderr

        import nibabel as nib
        import numpy as np
        import scipy.stats as stats
        affine = None
        try:
            groups = dict()
            for group_id, subj_list in dict(group_a=group_a, group_b=group_b).items():
                stack = []
                for i, img_path in enumerate(subj_list):
                    stdout.write(f'{img_path} is loaded as group {group_id}\n')
                    img = nib.load(img_path)
                    if i == 0:
                        affine = img.affine
                    stack.append(np.asarray(img.dataobj))
                groups[group_id] = np.concatenate(stack, axis=-1)
                stdout.write(f'{group_id} has been stacked.\n')
            t, p = stats.ttest_ind(groups['group_a'], groups['group_b'], axis=-1)
            imgobj = np.concatenate([t[..., np.newaxis], p[..., np.newaxis]], axis=-1)
            ttest_result = nib.Nifti1Image(imgobj, affine)
            ttest_result.to_filename(output)
            stdout.write(f'{output} is created\n')
            with open(f'{output}_report.html', 'w') as f:
                f.write('<html>Hello world.</html>\n')
        except Exception:
            stderr.write('[ERROR] Failed.\n')
            import traceback
            traceback.print_exception(*sys.exc_info(), file=stderr)
            return 1
        return 0

    itb = pipe.get_builder()
    itb.init_step('2ndLevelStatistic', suffix='func', idx=3, subcode=0,
                  mode='reporting', type='python')  # for reporting, by default, the output is a directory without extension
    itb.set_input(label='group_a', input_path='01B', group_input=True,
                  filter_dict=dict(regex=r'sub-\d{2}_task-rest.*'),
                  join_modifier=False)  # if False, the input is passed as a list object
                                        # so you can loop over it within the Python function
    itb.set_input(label='group_b', input_path='01B', group_input=True,
                  filter_dict=dict(regex=r'sub-\d{2}_task-active.*'),
                  join_modifier=False)
    itb.set_output(label='output',
                   modifier='TTest', ext='nii.gz')  # maps all peers to one output
    itb.set_func(myttestfunc)
    itb.set_output_checker(label='output')  # this checks whether TTest.nii.gz is generated, but not the html file
    itb.run()
    ```
  • Check progression using the progress bar

    ```python
    >>> pipe.check_progression()
    MyPipeline 50%|████████████████          | 1/2
    ```
  • Access the absolute path of an image file in dataset '01A'

    ```python
    pipe.get_dset('01A').df  # the get_dset method returns a dataset object
    # ...will print out the data structure...

    file_1 = pipe.get_dset('01A')[0].Abspath  # get the absolute path of the first indexed file
    ```

  • All the above procedures can be packaged as a plug-in module to simplify execution.
  • For more detailed information, please check the links below.

Regular expression for data filtering

  • Regex patterns used in this module
    • This module uses regular expressions to search for a specific filename without its extension,
      so the file extension must be provided as a separate filter key.
  • Filter key

    • Dataclass specific keys
      • ‘Data’: dataset path (idx:0): subjects, datatypes
      • ‘Processing’: working path (idx:1): pipelines, steps
      • ‘Results’: results path (idx:2): pipelines, reports
      • ‘Mask’: masking path (idx:3): subjects, datatypes
      • ‘Temp’: temporary (idx:4): pipelines, steps
    • File specific keys
      • regex: regex pattern for filename
      • ext: file extension
  • Output filename specification

    • Using a prefix, suffix, and/or modifier changes the output filename.
    • The modifier is a key-value paired Python dictionary object: each key is searched for in the filename and replaced with its value.
    • Alternatively, for reporting purposes (where group_input=True), a single string can be used as the output filename.
  • Output checker

    • The output checker is required to validate that the result file has been generated.
    • If the output filename is the same as the one you specified in the output, you only need to provide the label of your output.
    • However, some tools generate multiple files, which results in modified filenames.
    • In this case, you can specify a prefix and suffix here to let the processor know which file to check to validate the success of the process.
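The filtering and modifier rules above can be illustrated with plain Python. This is a sketch of the behavior described in this section, not PyNIPT's internals; `split_ext`, `filter_files`, and `apply_modifier` are hypothetical names, and the list of known extensions is an assumption for the example.

```python
import re
from os.path import basename

# Split a filename into (name, extension), handling double
# extensions such as '.nii.gz' (assumed list, for illustration only).
def split_ext(filename):
    for ext in ('.nii.gz', '.nii', '.json'):
        if filename.endswith(ext):
            return filename[:-len(ext)], ext.lstrip('.')
    return filename, ''

# Select files by a regex on the extension-less filename,
# with the extension supplied as a separate 'ext' filter key.
def filter_files(paths, regex=None, ext=None):
    selected = []
    for p in paths:
        name, e = split_ext(basename(p))
        if regex is not None and not re.match(regex, name):
            continue
        if ext is not None and e != ext:
            continue
        selected.append(p)
    return selected

# Apply a modifier dict: each key found in the filename
# is replaced with its value.
def apply_modifier(filename, modifier):
    name, ext = split_ext(filename)
    for old, new in modifier.items():
        name = name.replace(old, new)
    return f'{name}.{ext}' if ext else name

files = ['sub-01_task-rest_bold.nii.gz', 'sub-01_task-rest_bold.json']
print(filter_files(files, regex=r'sub-\d{2}_task-rest.*', ext='nii.gz'))
# → ['sub-01_task-rest_bold.nii.gz']
print(apply_modifier('sub-01_task-rest_bold.nii.gz', {'bold': 'bold_sm'}))
# → 'sub-01_task-rest_bold_sm.nii.gz'
```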

The StepCode to access data

  • The StepCode is designed to enhance data accessibility of a specific processing node without knowing the data structure.
  • In PyNIPT, each processing step must be assigned a unique StepCode composed of 3 characters (e.g. '03E').
    • The first two digits identify the level of the process. A total of 100 levels (00 to 99) are available.
    • The last character identifies the sub-step of each level. A total of 27 sub-steps can be specified (0 or A-Z).
    • The sub-step can be used to distinguish related steps within the same level.
  • Example folder name of one processing node: '01E_MotionCorrection-func'
    • '01E' is the StepCode.
    • 'MotionCorrection' is the title of the processing node.
    • 'func' is a suffix to distinguish the node if the same processing step is used multiple times (the same folder name cannot be used more than once, so the suffix is crucial).
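A node folder name can be decomposed mechanically following the convention above. The parser below is an illustrative sketch, not PyNIPT's own code; `parse_node_name` is a hypothetical helper.

```python
import re

# Hypothetical helper: split a node folder name such as
# '01E_MotionCorrection-func' into StepCode, title, and suffix.
# StepCode = two digits (level 00-99) + one sub-step character (0 or A-Z).
def parse_node_name(folder):
    m = re.match(r'^(\d{2}[0A-Z])_([A-Za-z0-9]+)-(\w+)$', folder)
    if m is None:
        raise ValueError(f'not a valid node folder name: {folder}')
    step_code, title, suffix = m.groups()
    return dict(step_code=step_code, level=int(step_code[:2]),
                substep=step_code[2], title=title, suffix=suffix)

print(parse_node_name('01E_MotionCorrection-func'))
# → {'step_code': '01E', 'level': 1, 'substep': 'E',
#    'title': 'MotionCorrection', 'suffix': 'func'}
```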

Tutorials

The tutorials are not ready yet; they will be provided soon.

License

PyNIPT is licensed under the terms of the GNU General Public License Version 3.

Authors

Contributors

If you are interested in contributing to this project, please contact shlee@unc.edu.

Citing PyNIPT

Lee, SungHo, Ban, Woomi, & Shih, Yen-Yu Ian. (2020, May 25). PyNIPT/pynipt: PyNIPT v0.2.1 (Version 0.2.1). Zenodo. http://doi.org/10.5281/zenodo.3842192

```
@software{lee_sungho_2020_3842192,
  author    = {Lee, SungHo and
               Ban, Woomi and
               Shih, Yen-Yu Ian},
  title     = {PyNIPT/pynipt: PyNIPT v0.2.1},
  month     = may,
  year      = 2020,
  publisher = {Zenodo},
  version   = {0.2.1},
  doi       = {10.5281/zenodo.3842192},
  url       = {https://doi.org/10.5281/zenodo.3842192}
}
```