Related pull request:
https://github.com/SynBioDex/SBOL-utilities/pull/135
Change log:
Changes:
- Added support in the new converter for parsing files with multiple records and features, such as test/test_files/iGEM_SBOL2_imports.gb.
- Added a unit test for this case. It is currently identical to the previous unit test, but I think we should keep it as a separate function rather than calling the previous one, so that we can add more checks later as the complexity of the GenBank files increases.
- Copied the GenBank-to-SO and SO-to-GenBank ontology mappings from here into the files GB2SO and SO2GB (content modified via vim macros, which makes the files easier to parse into CSVs).
- The file script.py at the root level parses the above two files to create the CSVs gb2so.csv and so2gb.csv. The script, along with the intermediate GB2SO and SO2GB files, is needed temporarily while we figure out what to do with "many to one" mappings. These can be deleted before the final merge.
- Added the test file test/test_files/iGEM_SBOL2_imports_from_genbank_to_sbol3_direct.nt, generated by the new converter; if the conversion code changes are confirmed, it can serve as a reference for the unit tests.
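As a rough illustration of how a mapping CSV such as gb2so.csv could be consumed, the sketch below loads two-column rows into a dictionary and falls back to a default when a key is missing. The column layout, the function names `load_mapping` and `genbank_to_so`, and the choice of SO:0000110 (sequence_feature) as the fallback are all assumptions made for this example, not the converter's actual code. The sample rows are inlined so the snippet runs without the real file.

```python
import csv
import io

# Assumed fallback term: SO:0000110 is the generic "sequence_feature".
DEFAULT_SO_TERM = "SO:0000110"


def load_mapping(csv_file):
    """Read (key, value) rows into a dict, skipping the header and malformed rows."""
    mapping = {}
    reader = csv.reader(csv_file)
    next(reader, None)  # skip the header row, if any
    for row in reader:
        # Ignore rows that lack a key or a value instead of failing outright.
        if len(row) < 2 or not row[0].strip():
            continue
        mapping[row[0].strip()] = row[1].strip()
    return mapping


def genbank_to_so(feature_key, mapping, default=DEFAULT_SO_TERM):
    """Look up a GenBank feature key, returning the default if it is unmapped."""
    return mapping.get(feature_key, default)


# Inline stand-in for gb2so.csv; the header names are hypothetical.
SAMPLE = "genbank_key,so_term\nCDS,SO:0000316\npromoter,SO:0000167\n"
gb2so = load_mapping(io.StringIO(SAMPLE))
```

With the real file, `load_mapping(open(path, newline=""))` would take the place of the `io.StringIO` stand-in; the reverse direction (so2gb.csv) could reuse the same loader with a GenBank default such as misc_feature.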
Pending Fixes:
- [x] Currently, the orientation of FeatureLocation is hardcoded to "inline orientation" ("https://identifiers.org/SO:0001030"); need to create a method to infer it from the GenBank file instead; not sure if it should be done here or in a different PR.
- [x] Create default values for the GenBank-to-SO and SO-to-GenBank conversions, to be used when a key is not found in the respective mapping dictionary.
- [x] Figure out how to manage "many to one" mappings - whether to include all matched values as roles / only the first matched, etc. - New spreadsheet mappings here had no "many to one" mapping ambiguities.
- [x] Also, all FeatureLocation types are Range currently; need to infer whether to choose "Range" or "Cut" from the GenBank file.
- [x] Add CSVs as package data to store them along with the pip installation.
- [x] Add logging, throw warnings for extra GenBank annotations not directly storable in sbol3 files.
- [x] Add error handling around parsing of the ontology mapping CSVs, for users who may not have updated the pip package.
- [x] Remove temp scripts and files used to generate CSV.
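The orientation and Range/Cut items above can be sketched without any GenBank library by looking at the location string itself. In GenBank syntax, `complement(...)` marks the reverse strand, `a..b` is a span, and `a^b` is a site between two adjacent bases. The helper name `describe_location` and the returned dictionary shape are hypothetical and only illustrate the inference; the actual converter may rely on the parsed feature objects instead. The sketch handles simple locations only (no `join(...)`, no fuzzy `<`/`>` bounds), and it assumes SO:0001031 as the reverse-complement counterpart of the "inline" term SO:0001030.

```python
import re

# Orientation URIs; SO:0001030 is the "inline" term named in the notes above.
SO_FORWARD = "https://identifiers.org/SO:0001030"
SO_REVERSE = "https://identifiers.org/SO:0001031"  # assumed reverse-complement term

_RANGE = re.compile(r"^(\d+)\.\.(\d+)$")  # e.g. 12..34
_CUT = re.compile(r"^(\d+)\^(\d+)$")      # e.g. 5^6 (between bases 5 and 6)


def describe_location(location):
    """Infer location type and orientation from a simple GenBank location string."""
    text = location.strip()
    orientation = SO_FORWARD
    # complement(...) wraps a location that lies on the reverse strand.
    if text.startswith("complement(") and text.endswith(")"):
        orientation = SO_REVERSE
        text = text[len("complement("):-1]

    match = _RANGE.match(text)
    if match:
        return {
            "type": "Range",
            "start": int(match.group(1)),
            "end": int(match.group(2)),
            "orientation": orientation,
        }

    match = _CUT.match(text)
    if match:
        # A cut is recorded by the base it follows.
        return {"type": "Cut", "at": int(match.group(1)), "orientation": orientation}

    raise ValueError(f"Unsupported location syntax: {location!r}")
```

For example, `describe_location("complement(12..34)")` yields a reverse-oriented Range from 12 to 34, while `describe_location("5^6")` yields a forward-oriented Cut at 5. Raising on unsupported syntax, rather than silently defaulting to a Range, is one possible design choice; logging a warning and falling back would be another.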