Jamie Tanna (https://www.jvt.me)
Senior Software Engineer @ Elastic
/usr/bin/whoami
As I wrote in the post Analysing our dependency trees to determine where we should send Open Source contributions for Hacktoberfest
But it's not always ☀🌈
💖
Do you fully appreciate the depth of your dependency on the software supply chain?
Being able to understand how your business uses Open Source is really important for a few other key reasons:
Being able to understand how your business uses Open Source internal software is really important for a few other key reasons:
Other insights into:
How can we do it?
💰🤑💸
Let's use Open Source!
Dependency Management Data (DMD) - dmd.tanna.dev
What's in the project?
dmd
dmd-web
, and the GraphQL-only web application dmd-graph
SQLite database
dmd
- all state synced to the DB
dmd
dmd-web
sql-studio
's excellent SQLite UI
dmd-graph
commit 73a99614a2af6fa9f66508bab8541ed65e18ed66
Author: Jamie Tanna <>
Date: Thu Feb 2 09:23:43 2023 +0000
Initialise project
LICENSE.md | 13 +++++++++++++
README.md | 7 +++++++
public/index.html | 90 ++++++++++++++++++++++++++++++++++++
3 files changed, 110 insertions(+)
# produce some data that DMD can import, i.e.
npx @jamietanna/renovate-graph@latest --token $GITHUB_TOKEN your-org/repo
# set up the database
dmd db init --db dmd.db
# import renovate-graph data
dmd import renovate --db dmd.db 'out/*.json'
# optionally, generate advisories
dmd db generate advisories --db dmd.db
# then you can start querying it
sqlite3 dmd.db 'select count(*) from renovate'
# set up the database
dmd db init --db dmd.db
# taking an SBOM that was produced from the GitLab repo
# https://gitlab.com/tanna.dev/dependency-management-data
dmd import sbom --db dmd.db '/path/to/sbom.json' --platform gitlab
--organisation tanna.dev
--repo dependency-management-data
# take an SBOM that was produced in some unknown place,
# and auto-detect what we can
dmd import sbom '/path/to/sbom.json'
# take an SBOM that was produced by a vendor
dmd import sbom '/path/to/sbom.json' --vendor ExampleCorp
--product 'Web Server' --product-version 5.0.0
# optionally, generate advisories
dmd db generate advisories --db dmd.db
# then you can start querying it
sqlite3 dmd.db 'select count(*) from renovate'
When importing:
Once ingested, write SQL to your heart's content 🤓 i.e.
Right now we can write SQL queries to ask:
what Terraform modules and versions are being used across the org?
which teams are using the Gin web framework?
But what if we could ask:
are any of our projects relying on libraries that no longer are recommended by our language guilds?
how much time should my team(s) be planning in the next quarter to upgrade their AWS infrastructure?
Dependency advisory data sources:
AWS infrastructure advisory data sources:
🤫
Custom advisories 🦸
Community provided advisories via -contrib:
INSERT INTO custom_advisories (
package_pattern,
package_manager,
version,
version_match_strategy,
advisory_type,
description
) VALUES (
'github.com/golang/mock',
'gomod',
NULL,
NULL,
'UNMAINTAINED',
'golang/mock is no longer maintained, and active development been moved to github.com/uber/mock'
);
// Versions of Gin >= 1.9
deny contains msg if {
input.dependency.package_manager in {"gomod", "golang"}
input.dependency.package_name = "github.com/gin-gonic/gin"
versions[0] in {"v1", "1"}
to_number(versions[1]) >= 9
msg := sprintf("%s. Versions of Gin since v1.9.0 have shipped " +
"ByteDance/sonic as an optional dependency, but it still " +
"appears as a dependency, and could be in use - more " +
"details in " +
"https://github.com/gin-gonic/gin/issues/3653", [prefix])
}
"What package advisories do I have?"
organisation | repo | package_name | current_version | dep_types | advisory_type | description |
---|---|---|---|---|---|---|
alphagov | pay-selfservice | node | 18.20.4 | ["engines"] | DEPRECATED | nodejs 18 has been unsupported (usually only receiving critical security fixes) for 400 days |
elastic | beats | github.com/golang/mock | v1.6.0 | ["require"] | UNMAINTAINED | golang/mock is no longer maintained, and active development been moved to github.com/uber/mock |
monzo | response | python | 3.7 | [] | UNMAINTAINED | python 3.7 has been End-of-Life for 513 days |
via https://dependency-management-data-example.fly.dev/report/advisories
# there may be some folks using YAML anchors
sidecars: &sidecars
# or there could also be comments in here!
kafka: 1.2.3
app:
image: "internal.docker.registry/service-name"
*sidecars
In the renovate
table:
repo | package_name | current_version | package_file_path |
---|---|---|---|
good-service | internal-docker.tld/kafka | 0.3.0 | .hopper.yml |
affected-service | internal-docker.tld/kafka | 0.2.1 | .hopper.yml |
also-affected-service | internal-docker.tld/kafka | 0.1.0 | .hopper.yml |
Parsed via https://docs.renovatebot.com/modules/manager/regex/
And the repository_metadata
table:
repo | additional_metadata |
---|---|
good-service | {"tier": "tier_1"} |
affected-service | {"tier": "tier_1"} |
also-affected-service | {"tier": "tier_2"} |
Slightly garish query:
select
renovate.organisation,
renovate.repo,
current_version,
owner,
json_extract(additional_metadata, '$.tier') as tier
from
renovate
left join owners on
renovate.platform = owners.platform
and renovate.organisation = owners.organisation
and renovate.repo = owners.repo
left join repository_metadata on
renovate.platform = repository_metadata.platform
and renovate.organisation = repository_metadata.organisation
and renovate.repo = repository_metadata.repo
where
-- NOTE: that this is performed with a lexicographical match, which is NOT
-- likely to be what you are expecting to perform version constraint matching
-- but this is a good start for these use cases
renovate.current_version < '0.3'
order by
tier ASC
Result:
organisation | repo | current_version | owner | tier |
---|---|---|---|---|
deliveroo | affected-service | 0.2.1 | Grocery | tier_1 |
deliveroo | also-affected-service | 0.1.0 | tier_2 |
We could use the dependenton
query:
# for Gradle projects
$ dmd report dependenton --db dmd.db --package-manager gradle
--package-name org.apache.logging.log4j:log4j-core
+-------------------+---------+-------------------+----------------------------+-------------+
| REPO | VERSION | DEPENDENCY TYPES | FILEPATH | OWNER |
+-------------------+---------+-------------------+----------------------------+-------------+
| logstash | 2.17.1 | ["dependencies"] | logstash-core/build.gradle | Elastic |
| logstash | 2.17.1 | ["dependencies"] | logstash-core/build.gradle | Elastic |
| fake-private-repo | 2.13.0 | ["dependencies"] | blank-java/build.gradle | Jamie Tanna |
| fake-private-repo | 2.13.0 | ["dependencies"] | blank-java/build.gradle | Jamie Tanna |
+-------------------+---------+-------------------+----------------------------+-------------+
Or start with SQL:
select
platform,
organisation,
repo,
current_version
from
renovate
where
package_name = 'org.apache.logging.log4j:log4j-core'
With the versions affected:
select
platform,
organisation,
repo,
current_version
from
renovate
where
package_name = 'org.apache.logging.log4j:log4j-core'
and current_version in (
'2.0-beta9', '2.0-rc1',
'2.0-rc2', '2.0.1',
'2.0.2', '2.0',
-- ....
-- ....
-- ....
'2.13.0', '2.13.1',
'2.13.2', '2.13.3',
'2.14.0', '2.14.1',
)
Or with a Policy:
default advisory_type := "SECURITY"
versions := split(input.dependency.version, ".")
major := to_number(versions[0])
minor := to_number(versions[1])
patch := to_number(versions[2])
is_log4j2 if {
input.dependency.package_manager in {"gradle", "maven"}
input.dependency.package_name =
"org.apache.logging.log4j:log4j-core"
major == 2
}
// ...
Or with Policy (continued):
// ...
// CVE-2021-44228 aka Log4shell affects versions 2.0-beta9 to
// 2.14.1
is_vulnerable_version if input.dependency.version in {
"2.0-beta9", "2.0-rc1", "2.0-rc2"}
// CVE-2021-44228 aka Log4shell affects versions 2.0-beta9 to
// 2.14.1
is_vulnerable_version if {
minor > 0
minor <= 14
}
deny contains msg if {
is_log4j2
is_vulnerable_version
msg := "Dependency is vulnerable to Log4Shell CVE " +
"(CVE-2021-44228)"
}
# produce some data that DMD can import, i.e.
npx @jamietanna/renovate-graph@latest --token $GITHUB_TOKEN \
your-org/repo another-org/repo
# or for GitLab
env RENOVATE_PLATFORM=gitlab npx @jamietanna/renovate-graph@latest \
--token $GITLAB_TOKEN your-org/repo another-org/nested/repo
# set up the database
dmd db init --db dmd.db
# import renovate-graph data
dmd import renovate --db dmd.db 'out/*.json'
# then you can start querying it
sqlite3 dmd.db 'select count(*) from renovate'