Stop F@#!ing (Forking) Your Dependencies

98% of PyMySQL forks are vulnerable to SQL Injection

Ever wonder what happens when you fork a dependency instead of working with the original?

Spoiler: It’s usually not the shortcut you’d hope for. 🙈 This post dives into why forking can hamstring your project and how to avoid getting stuck in this common software development trap.

This article also applies to software dependencies managed via internal repositories (e.g., Artifactory and Nexus Repository). Mitigating distribution risks is great, but things can get ugly if your repository gets stale!

Why We Fork: The Good, The Bad, and the Ugly

Forking a project often starts with the best intentions. Maybe you need a tweak that the original maintainers aren't keen on, or perhaps you're itching to get new features out the door without battling through the pull request review process. Whatever the case, the allure of instant control is tempting. But here's the catch...

The Risks of Running Rogue

Forking a repository means continuously integrating updates from the original project—including essential security patches and performance enhancements—while ensuring they fit with your custom changes. This ongoing maintenance demands vigilant oversight to prevent disruptions to your modifications. Without timely updates, your fork risks becoming outdated, less secure, and less efficient.

Furthermore, staying aligned with the parent project requires sustained effort. Without support from the original community, you must manage these updates alone, shifting resources from innovation to basic maintenance. This isolation limits your project's growth, because maintaining compatibility takes precedence over developing new features.

A significant, often overlooked risk:

👍️ Scenario:

  1. Your application uses a known vulnerable software dependency (‘Foo’).

  2. You run a vulnerability scanner. It identifies that you’re using ‘Foo’ and alerts you to the finding.

👎️ Versus:

  1. Your application uses a fork of a known vulnerable software dependency (‘Foo-Fork’).

  2. You run a vulnerability scanner. It identifies ‘Foo-Fork’ but fails to associate it with the known vulnerable ‘Foo’ package.

Therefore, the fastest way to remediate all known third-party package vulnerabilities is to fork and rename them! Please don’t do this 🤣.
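To make the two scenarios concrete, here is a minimal sketch of a scanner that matches dependencies against advisories by package name only. The advisory ID and the package names are made up for illustration, and real scanners may use additional signals, so treat this as a toy model of the failure mode rather than a description of any particular tool.

```python
# Hypothetical advisory database keyed by package name (illustrative data only).
KNOWN_VULNERABLE = {"foo": ["EXAMPLE-2024-0001"]}


def scan(dependency_names):
    """Return advisories for every dependency whose name matches exactly."""
    findings = {}
    for name in dependency_names:
        advisories = KNOWN_VULNERABLE.get(name.lower())
        if advisories:
            findings[name] = advisories
    return findings


# Scenario 1: the original package is flagged.
original_findings = scan(["foo", "bar"])
# Scenario 2: the renamed fork carries the same code but matches nothing.
fork_findings = scan(["foo-fork", "bar"])

print(original_findings)
print(fork_findings)
```

The fork ships the same vulnerable code, but a name-based lookup has nothing to tie it back to ‘Foo’.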

The PyMySQL project is an interesting case because Tidelift pays the maintainers to implement industry-leading secure software development practices and to document them. The project has 7.6K GitHub stars, 230K users, and 1.4K forks.

CVE-2024-36039 affects PyMySQL versions before 1.1.1: they allow SQL injection when used with untrusted JSON input, because the escape_dict() function escapes values but not keys. The patch was committed on May 21st, 2024. Here is the vulnerable function:

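# escape_dict() before the 1.1.1 patch: values are escaped, keys are not.
def escape_dict(val, charset, mapping=None):
    n = {}
    for k, v in val.items():
        quoted = escape_item(v, charset, mapping)
        n[k] = quoted  # the key k is stored as-is, unescaped
    return n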
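Here is a small, self-contained sketch of why that matters. The escape_string() helper below is a simplified stand-in I wrote for illustration, not PyMySQL's actual escaping code, and escape_dict_strict() is just one possible way to close the hole (refusing dicts outright), not necessarily what the upstream patch does.

```python
def escape_string(value):
    """Simplified stand-in for string escaping (assumption, not library code)."""
    return "'" + value.replace("\\", "\\\\").replace("'", "\\'") + "'"


def escape_dict_vulnerable(val):
    """Mirrors the logic above: every value is escaped, every key is copied as-is."""
    return {k: escape_string(v) for k, v in val.items()}


def escape_dict_strict(val):
    """One possible remediation (hypothetical): reject dict parameters entirely."""
    raise TypeError("dict can not be used as a parameter")


# Attacker-controlled JSON object: both the key and the value contain quotes.
payload = {"name' OR '1'='1": "alice' --"}
escaped = escape_dict_vulnerable(payload)

for key, value in escaped.items():
    print("key:  ", key)    # unchanged, quotes intact
    print("value:", value)  # quotes escaped
```

The value comes back with its quote neutralized, while the key passes through untouched. Any code that later places that key into a query string inherits whatever the attacker put in it.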

So, how many of these 1.4K forks are also vulnerable? To find out, I cloned the latest version of every available fork (832 total) and used Semgrep to verify the function was present and vulnerable.
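I used Semgrep for this, but the idea can be approximated with Python's standard ast module. The sketch below is not the rule I ran. It uses a crude heuristic that I am assuming for illustration: an escape_dict() function that never raises is treated as unpatched, on the theory that a patched version would refuse dict input. The three sample sources are inline strings so the example runs on its own.

```python
import ast


def has_unpatched_escape_dict(source):
    """Return True if source defines escape_dict() with no raise statement in it.

    Heuristic only (assumption): a fixed escape_dict is expected to reject dicts.
    """
    tree = ast.parse(source)
    for node in ast.walk(tree):
        if isinstance(node, ast.FunctionDef) and node.name == "escape_dict":
            if not any(isinstance(n, ast.Raise) for n in ast.walk(node)):
                return True
    return False


VERBATIM = '''
def escape_dict(val, charset, mapping=None):
    n = {}
    for k, v in val.items():
        n[k] = escape_item(v, charset, mapping)
    return n
'''

# Same behavior, different style: a text search for the original lines would miss it.
RESTYLED = '''
def escape_dict(val, charset):
    return {k: escape_item(v, charset) for k, v in val.items()}
'''

# Hypothetical patched variant that refuses dict parameters.
REJECTING = '''
def escape_dict(val, charset, mapping=None):
    raise TypeError("dict can not be used as a parameter")
'''

for label, src in [("verbatim", VERBATIM), ("restyled", RESTYLED), ("rejecting", REJECTING)]:
    print(label, has_unpatched_escape_dict(src))
```

Matching on structure rather than exact text is what lets a check like this catch forks that reformatted the function without fixing it.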

The most popular fork (163 stars) does not use the vulnerable code verbatim but is still vulnerable. Style changes don’t remediate vulnerabilities, but they certainly make merging commits from the main repo difficult.

# Restyled in the fork, but keys still pass through unescaped.
def escape_dict(val, charset):
    return dict([(k, escape_item(v, charset)) for k, v in val.items()])

I doubt anyone will find this shocking after thinking about it for 30 seconds, but that’s just the thing: people aren’t typically thinking about it!

But If You Must…

Sometimes, forking is unavoidable. If you find yourself in that boat, here’s what you can and should do:

  • Stay Updated: Regularly pull changes from the original repository to keep your fork up to date.

  • Contribute Back: If possible, contribute your improvements back to the original repo. It reduces your maintenance load and benefits the community.

  • Automate What You Can: Use CI/CD pipelines to automate testing and syncing with the original project, ensuring you catch issues early.
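As a rough sketch of the "automate" point, a CI job could measure how far the fork has drifted and fail when it falls behind. This assumes the original repository is configured as a git remote named upstream with a main branch and has already been fetched; both names, and the helper functions themselves, are placeholders for illustration.

```python
import subprocess
from typing import Tuple


def parse_left_right_count(output: str) -> Tuple[int, int]:
    """Parse 'ahead<TAB>behind' as printed by git rev-list --left-right --count."""
    ahead, behind = output.split()
    return int(ahead), int(behind)


def fork_divergence(repo_dir: str, upstream_ref: str = "upstream/main") -> Tuple[int, int]:
    """Return (commits only in the fork, commits only in upstream)."""
    result = subprocess.run(
        ["git", "rev-list", "--left-right", "--count", f"HEAD...{upstream_ref}"],
        cwd=repo_dir,
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_left_right_count(result.stdout)


def is_stale(behind: int, max_behind: int = 0) -> bool:
    """Decide whether the fork is missing more upstream commits than allowed."""
    return behind > max_behind


# Example with canned output, so no repository is needed to try it:
ahead, behind = parse_left_right_count("3\t12\n")
print(f"fork is {ahead} ahead and {behind} behind; stale: {is_stale(behind)}")
```

In a pipeline, you would call fork_divergence() on the checked-out fork and exit non-zero when is_stale() returns True, so a missed security patch shows up as a failed build instead of a surprise.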

Better Yet, Don't Fork

Before you fork, consider these alternatives:

  • Contribute Directly: Engage with the original project. It’s more collaborative and often leads to better outcomes.

  • Plugins or Extensions: Can you extend functionality without forking? Many projects support plugins that can offer the customization you need without the mess.

  • Choose Wisely: Opt for actively maintained projects receptive to community input. It makes all the difference.
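When a project has no plugin system, a thin wrapper in your own codebase can often stand in for a fork. The sketch below is one way that might look: GuardedCursor is a hypothetical class that wraps any cursor-like object with an execute(query, args) method and rejects dict values before they reach the library. RecordingCursor is a fake used only so the example runs without a database.

```python
class GuardedCursor:
    """Wrap a cursor-like object and refuse dict-valued query parameters."""

    def __init__(self, cursor):
        self._cursor = cursor

    def execute(self, query, args=None):
        if isinstance(args, dict):
            values = list(args.values())
        elif isinstance(args, (list, tuple)):
            values = list(args)
        else:
            values = [] if args is None else [args]
        for value in values:
            if isinstance(value, dict):
                raise TypeError("dict values are not allowed as query parameters")
        return self._cursor.execute(query, args)

    def __getattr__(self, name):
        # Everything else is delegated to the wrapped cursor unchanged.
        return getattr(self._cursor, name)


class RecordingCursor:
    """Fake cursor for demonstration: records calls instead of talking to a database."""

    def __init__(self):
        self.calls = []

    def execute(self, query, args=None):
        self.calls.append((query, args))
        return 1


inner = RecordingCursor()
cursor = GuardedCursor(inner)
cursor.execute("SELECT * FROM users WHERE id = %s", (42,))
print(inner.calls)
```

The dependency stays untouched and upgradable, and the custom behavior lives in a few lines that you own.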

The Forking Conclusion

Forking might seem like a quick fix, but it’s often a one-way ticket to technical debt and security nightmares. Before you fork, consider the long-term impact on your project and explore the alternatives.

Until Next Time! 👋 

Hey, you made it to the bottom – thanks for sticking around!

Questions, ideas, or want to chat? Slide into my inbox! 💌

Don’t hesitate to forward if someone could benefit from this.

See you next Monday!
-Kyle
