Data scientists, who often choose open source packages without considering security, increasingly face concerns over the unvetted use of those components, a new study shows.
Vulnerabilities in open source components — such as the widespread flaws revealed 10 months ago in Log4j 2.0 — have forced data scientists to reevaluate the open source code frequently used in analysis and the creation of machine learning models.
In the past year, 40% of surveyed data scientists, business analysts, and students scaled back their use of open source components, while a third kept their usage steady and only 7% incorporated more open source code into their projects, according to the "2022 State of Data Science" report, released last week by Anaconda, a data-science platform firm. Few of those surveyed report to the information technology department (18%); nearly half (47%) work within their own data science or research and development group.