PROMETHEUS

stands for

Python Runtime for Operations Management for Ecosystems by Tracking, Handling, and Exchange of Updates via Schemas

An ERP model for assessing deprecations in Python-based packages and software in a unified manner

You rarely write all of the code on your own. Instead, you use existing functionality provided by "libraries", but this functionality can change with time :(

Deprecations?

Why?

  • To improve the functionality
  • To make things uniform
  • To give a better name
  • On users' demand
  • ....

Impact?

  • Update your code to new norms
  • The deprecated code WILL stop working soon
  • Also, do everything yourself

For example? NumPy!

The fundamental package for scientific computing with Python (free and open source).

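import numpy

# Old: numpy.matrix, now discouraged in favour of regular arrays
numpy.matrix([
  [1, 2, 3],
  [4, 5, 6],
  [7, 8, 9]
])

import numpy

# New: numpy.array, which takes the rows as one nested list
numpy.array([
  [1, 2, 3],
  [4, 5, 6],
  [7, 8, 9]
])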

Your phone, Tesla's self-driving cars, NASA's helicopter on Mars, all use NumPy - 6 billion+ installs

It took them 10 years to deprecate this, because of its widespread usage (read: MASSIVE!)

Uniform syntax, more efficient, less internal code, more compatible, ....

Deprecations are not bad, but dealing with them can be painful

How do I handle them?

You might only learn of a deprecation when your users start complaining about downtime

NumPy recently removed one of its deprecated modules, and a majority of the libraries in the scientific Python ecosystem failed at once in their CI services. Saransh was personally up the whole night fixing the libraries he maintains.

You will have to fix them manually

You might be tempted to pin an older version of a dependency, but that will only hurt in the long run, since it causes compatibility issues later
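One way to hear about a deprecation before your users do is to turn deprecation warnings into errors in your own tests. The sketch below uses only the standard `warnings` module; `old_api` and `uses_deprecated_api` are hypothetical names made up for illustration, not part of any real library.

```python
import warnings


def old_api():
    """Hypothetical library function that has been deprecated."""
    warnings.warn(
        "old_api() is deprecated; use new_api() instead",
        DeprecationWarning,
        stacklevel=2,
    )
    return 42


def uses_deprecated_api(func):
    """Return True if calling func emits a DeprecationWarning."""
    with warnings.catch_warnings():
        # Inside this block, a DeprecationWarning is raised as an exception.
        warnings.simplefilter("error", DeprecationWarning)
        try:
            func()
        except DeprecationWarning:
            return True
    return False


print(uses_deprecated_api(old_api))
print(uses_deprecated_api(lambda: 1))
```

This only catches deprecations that the library announces through warnings and that your tests actually exercise, so it complements the manual fixes rather than replacing them.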

We see a gap!

As part of this ongoing effort, this project aims to create a uniform, automated solution for assessing and managing API deprecations from upstream packages.

Current efforts focus on clearer deprecation notices, migration guides that help developers transition to newer APIs, and better communication between upstream (the package authors) and downstream (the dependent package maintainers) parties.

In addition to these communication efforts, there's a push for creating dependency management tools and implementing automated testing practices.

Existing similar tools

Flake8: Flake8 can catch not only syntax errors but also complex or error-prone constructs.

Pylint: More rigorous than Flake8, it offers more features including the ability to write custom plugins, making it customizable for deprecation warnings.

Mypy: Primarily a type checker, Mypy can also catch certain types of deprecations, especially those related to type annotations and related changes.

Bandit: Focuses on finding common security issues in Python code but can also be configured to flag use of deprecated libraries or functions.

Proposed implementation

1

A library maintains a systematic record of its deprecations (in our proposed JSON schema).

2

The library depending on this library adds our tool as a pre-commit hook.

3

Our tool automatically fetches every dependency's schema and runs it against the codebase to pinpoint deprecations - linting.

4

The same tool then automatically updates the deprecated API, leaving no space for human work - formatting.

5

The ecosystem stays up-to-date as much as possible.
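The linting step could be sketched roughly as follows. This is a minimal illustration, not the tool itself: it assumes that each "feature" in the schema holds a dotted name exactly as it appears in the source (for example "numpy.matrix"), and the function names `dotted_name` and `find_deprecations` are hypothetical.

```python
import ast


def dotted_name(node):
    """Return 'a.b.c' for an attribute chain rooted at a plain name, else None."""
    parts = []
    while isinstance(node, ast.Attribute):
        parts.append(node.attr)
        node = node.value
    if isinstance(node, ast.Name):
        parts.append(node.id)
        return ".".join(reversed(parts))
    return None


def find_deprecations(source, schema):
    """Return sorted (line, feature, replacement) tuples for deprecated names."""
    # Assumption: "feature" is the dotted name as written in user code.
    deprecated = {d["feature"]: d["replacement"] for d in schema["deprecations"]}
    findings = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Attribute):
            name = dotted_name(node)
            if name in deprecated:
                findings.append((node.lineno, name, deprecated[name]))
    return sorted(findings)


# Illustrative schema and source, made up for this sketch.
SCHEMA = {
    "library": "NumPy",
    "deprecations": [{"feature": "numpy.matrix", "replacement": "numpy.array"}],
}
SOURCE = "import numpy\nm = numpy.matrix([[1, 2], [3, 4]])\n"

for line, feature, replacement in find_deprecations(SOURCE, SCHEMA):
    print(f"line {line}: {feature} is deprecated, use {replacement}")
```

The sketch does not resolve import aliases such as `import numpy as np`, and it only reports findings. The formatting step would additionally have to rewrite the source, which is the harder part.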

JSON Schema

Most libraries already maintain a "CHANGELOG", which is good but not machine-readable (or systematic).

We propose a new systematic way to store all the deprecations.

{
  "library": "...",
  "version": "...",
  "deprecations": [
    {
      "feature": "...",
      "deprecation_version": "...",
      "replacement": "...",
      "deprecation_reason": "...",
      "additional_metadata": "..."
    }
  ],
  "integration": {
    "rss_feed_support": bool,
    "notification_granularity": "..."
  },
  "error_codes": {
    "fft_deprecation": "..."
  }
}

Example

{
  "library": "NumPy",
  "version": "1.20.0",
  "deprecations": [
    {
      "feature": "fft",
      "deprecation_version": "1.19.0",
      "replacement": "numpy.fft.fft",
      "deprecation_reason": "The 'fft' function is deprecated in favor of 'numpy.fft.fft' for improved consistency.",
      "additional_metadata": "This change improves code readability and aligns with the common naming convention."
    }
  ],
  "integration": {
    "rss_feed_support": true,
    "notification_granularity": "weekly"
  },
  "error_codes": {
    "fft_deprecation": "NPYFFT103"
  }
}
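A record like this can be read with nothing more than the standard library. The sketch below loads the "deprecations" list into small typed objects and rejects incomplete entries. Which fields count as required is an assumption of this sketch, and `Deprecation` and `load_deprecations` are hypothetical names.

```python
import json
from dataclasses import dataclass, fields

# Assumption: these four fields are mandatory; the proposal does not say so.
REQUIRED_FIELDS = ("feature", "deprecation_version", "replacement", "deprecation_reason")


@dataclass(frozen=True)
class Deprecation:
    feature: str
    deprecation_version: str
    replacement: str
    deprecation_reason: str
    additional_metadata: str = ""


def load_deprecations(text):
    """Parse a schema record and return its deprecations as typed objects."""
    record = json.loads(text)
    known = {f.name for f in fields(Deprecation)}
    result = []
    for entry in record["deprecations"]:
        missing = [name for name in REQUIRED_FIELDS if name not in entry]
        if missing:
            raise ValueError(f"deprecation entry is missing fields: {missing}")
        # Ignore keys this sketch does not know about.
        result.append(Deprecation(**{k: v for k, v in entry.items() if k in known}))
    return result


# A shortened copy of the example record above.
EXAMPLE = """
{
  "library": "NumPy",
  "version": "1.20.0",
  "deprecations": [
    {
      "feature": "fft",
      "deprecation_version": "1.19.0",
      "replacement": "numpy.fft.fft",
      "deprecation_reason": "Deprecated for improved consistency."
    }
  ]
}
"""

for dep in load_deprecations(EXAMPLE):
    print(f"{dep.feature} -> {dep.replacement} (since {dep.deprecation_version})")
```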

Integration with existing tools

Pre-Commit

Tools like Codacy and Codecov

BROADCASTING

Pre-commit CI and its mirrors

RSS (RDF Site Summary / Really Simple Syndication)

*RDF: Resource Description Framework
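Broadcasting over RSS could look roughly like the sketch below, which turns the deprecations of one schema record into an RSS 2.0 feed using the standard `xml.etree.ElementTree` module. The function name `deprecations_to_rss` and the example.org link are placeholders chosen for illustration.

```python
import xml.etree.ElementTree as ET


def deprecations_to_rss(schema, link):
    """Build an RSS 2.0 document with one item per recorded deprecation."""
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    # RSS 2.0 requires a title, link, and description on the channel.
    ET.SubElement(channel, "title").text = f"{schema['library']} deprecations"
    ET.SubElement(channel, "link").text = link
    ET.SubElement(channel, "description").text = (
        f"Deprecations recorded for {schema['library']} {schema['version']}"
    )
    for dep in schema["deprecations"]:
        item = ET.SubElement(channel, "item")
        ET.SubElement(item, "title").text = (
            f"{dep['feature']} deprecated in {dep['deprecation_version']}"
        )
        ET.SubElement(item, "description").text = dep["deprecation_reason"]
    return ET.tostring(rss, encoding="unicode")


# Illustrative record following the proposed schema.
SCHEMA = {
    "library": "NumPy",
    "version": "1.20.0",
    "deprecations": [
        {
            "feature": "fft",
            "deprecation_version": "1.19.0",
            "replacement": "numpy.fft.fft",
            "deprecation_reason": "Deprecated for improved consistency.",
        }
    ],
}

# The link is a placeholder, not a real feed location.
feed = deprecations_to_rss(SCHEMA, "https://example.org/numpy/deprecations")
print(feed)
```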

Economic viability

Work                        Time      Estimated cost
Design and planning         1 year    $100,000 (1 full-time manager)
Coding and implementation   1 year    $200,000 (3 full-time developers)
Testing and debugging       3 months  $30,000 (3 contract-based software)
Documentation               4 months  $15,000 (possibly outsourced)
Market research             1 month   $1,000

Funding from Open-Source grants - CZI, NumFOCUS, SSI, GSoC, GSoD, Outreachy, and others, but we will have to open-source our work

Gap Analysis & Self Assessment

1

Formatter

Building a linter is comparatively easier than building a formatter.

2

Open-Source

Open-sourcing our tool would mean no "profits", and it might get tough to maintain the tool in the long run.

3

Adoption

Adoption of the tool will only come into the picture once we launch it. Reaching out to developers for a survey is really hard.

Cost benefit analysis

Type hinting is a feature of Python that allows programmers to annotate their code with the expected types of variables, arguments, and return values.

These annotations can help developers write more readable, maintainable, and bug-free code, as well as enable tools such as IDEs, linters, and type checkers to provide better support and feedback.

Future implications

1. Compatibility Issues: If types used in a project become deprecated, it could lead to compatibility issues with newer versions of Python.
2. Documentation Updates: Documentation for the project may need to be updated to reflect the changes in type hinting, ensuring that users can understand and work with the new types effectively.
3. Testing and Quality Assurance: Extensive testing is crucial to ensure that the project functions correctly with the updated types. This may involve creating test cases specifically to cover the type hinting changes.
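As a small illustration of a deprecated type, `typing.List` has been a deprecated alias since Python 3.9, where the built-in `list` can be used in annotations directly. The two functions below are hypothetical and behave identically; only the annotation style differs. The sketch assumes Python 3.9 or newer.

```python
from typing import List  # deprecated alias since Python 3.9


def total_old(values: List[int]) -> int:
    """Old style: annotate with typing.List."""
    return sum(values)


def total_new(values: list[int]) -> int:
    """New style: annotate with the built-in list type."""
    return sum(values)


print(total_old([1, 2, 3]), total_new([1, 2, 3]))
```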

Thank you!

ERP project

By Saransh Chopra
