metricbeat/module/mongodb/collstats: Add extra collstats metrics #42171

Merged
merged 24 commits into from
Jan 20, 2025
Changes from 14 commits
Commits
eb7bbcb
mongo collStats PoC
stefans-elastic Dec 24, 2024
9ae1258
introduce waitgroup
stefans-elastic Dec 26, 2024
065a342
Merge branch 'main' of github.com:stefans-elastic/beats into mongo_co…
stefans-elastic Dec 27, 2024
3cef86a
[metricbeats][mongodb] handle extra collstats metrics
stefans-elastic Dec 27, 2024
b673cd9
fix linter errors
stefans-elastic Dec 27, 2024
7ef7d47
fix imports
stefans-elastic Dec 27, 2024
2eb5554
Merge branch 'main' of github.com:stefans-elastic/beats into mongo_co…
stefans-elastic Dec 30, 2024
0ad0c69
update changelog
stefans-elastic Dec 30, 2024
e6006d1
add max and nindexes to collstats data
stefans-elastic Jan 2, 2025
6cbbccd
Merge branch 'main' of github.com:stefans-elastic/beats into mongo_co…
stefans-elastic Jan 2, 2025
694b9ab
update copyright years in NOTICE.txt
stefans-elastic Jan 2, 2025
e12868d
update NOTICE.txt
stefans-elastic Jan 2, 2025
dff33ce
Merge branch 'main' of github.com:stefans-elastic/beats into mongo_co…
stefans-elastic Jan 3, 2025
ebacc6c
Merge branch 'main' into mongo_collstats
stefans-elastic Jan 15, 2025
ac1f926
Merge branch 'main' of github.com:stefans-elastic/beats into mongo_co…
stefans-elastic Jan 15, 2025
18b6e98
Merge branch 'main' of github.com:stefans-elastic/beats into mongo_co…
stefans-elastic Jan 15, 2025
d133f7f
Merge branch 'main' of github.com:stefans-elastic/beats into mongo_co…
stefans-elastic Jan 17, 2025
30a57f5
impove code readability, add code comments
stefans-elastic Jan 17, 2025
8efa46e
replace WaitGroup with errgroup
stefans-elastic Jan 17, 2025
ddef8aa
Merge branch 'mongo_collstats' of github.com:stefans-elastic/beats in…
stefans-elastic Jan 17, 2025
a56948c
run gofumpt to fix imports
stefans-elastic Jan 17, 2025
48fe367
fix loop variable captured by func literal
stefans-elastic Jan 17, 2025
d667587
Merge branch 'main' into mongo_collstats
stefans-elastic Jan 20, 2025
c58807f
Merge branch 'main' into mongo_collstats
shmsr Jan 20, 2025
1 change: 1 addition & 0 deletions CHANGELOG.next.asciidoc
@@ -445,6 +445,7 @@ https://github.com/elastic/beats/compare/v8.8.1\...main[Check the HEAD diff]
- Collect .NET CLR (IIS) Memory, Exceptions and LocksAndThreads metrics {pull}41929[41929]
- Added `tier_preference`, `creation_date` and `version` fields to the `elasticsearch.index` metricset. {pull}41944[41944]
- Add `use_performance_counters` to collect CPU metrics using performance counters on Windows for `system/cpu` and `system/core` {pull}41965[41965]
- Add support of additional `collstats` metrics in mongodb module. {pull}42171[42171]
- Preserve queries for debugging when `merge_results: true` in SQL module {pull}42271[42271]

*Metricbeat*
81 changes: 81 additions & 0 deletions metricbeat/docs/fields.asciidoc
@@ -50999,6 +50999,87 @@ type: long
Number of database commands executed.


type: long

--


*`mongodb.collstats.stats.stats.size`*::
+
--
The total uncompressed size in memory of all records in a collection.


type: long

--

*`mongodb.collstats.stats.stats.count`*::
+
--
The number of objects or documents in this collection.


type: long

--

*`mongodb.collstats.stats.stats.avgObjSize`*::
+
--
The average size of an object in the collection.


type: long

--

*`mongodb.collstats.stats.stats.storageSize`*::
+
--
The total amount of storage allocated to this collection for document storage.


type: long

--

*`mongodb.collstats.stats.stats.totalIndexSize`*::
+
--
The total size of all indexes.


type: long

--

*`mongodb.collstats.stats.stats.totalSize`*::
+
--
The sum of the storageSize and totalIndexSize.


type: long

--

*`mongodb.collstats.stats.stats.max`*::
+
--
Shows the maximum number of documents that may be present in a capped collection.


type: long

--

*`mongodb.collstats.stats.stats.nindexes`*::
+
--
The number of indexes on the collection. All collections have at least one index on the _id field.


type: long

--
12 changes: 11 additions & 1 deletion metricbeat/module/mongodb/collstats/_meta/data.json
@@ -69,11 +69,21 @@
"time": {
"us": 0
}
},
"stats": {
"totalSize": 8192,
"max": 5000,
"nindexes": 1,
"size": 36,
"count": 1,
"avgObjSize": 36,
"storageSize": 4096,
"totalIndexSize": 4096
}
}
},
"service": {
"address": "172.28.0.5:27017",
"type": "mongodb"
}
}
}
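The sample values above agree with the field descriptions: `totalSize` is the sum of `storageSize` and `totalIndexSize` (4096 + 4096 = 8192), and for this sample `avgObjSize` equals `size / count` (36 / 1 = 36). A minimal Go sketch of that check follows; the `collStats` struct and the `consistent` helper are hypothetical names used only for illustration and are not part of the Beats code base.

```go
package main

import "fmt"

// collStats is a hypothetical struct holding the sample values from
// data.json; it is not a type defined in this PR.
type collStats struct {
	Size           int64
	Count          int64
	AvgObjSize     int64
	StorageSize    int64
	TotalIndexSize int64
	TotalSize      int64
}

// consistent reports whether the derived fields agree with their inputs.
func consistent(s collStats) bool {
	if s.TotalSize != s.StorageSize+s.TotalIndexSize {
		return false
	}
	// Avoid dividing by zero for an empty collection.
	if s.Count == 0 {
		return s.AvgObjSize == 0
	}
	return s.AvgObjSize == s.Size/s.Count
}

func main() {
	sample := collStats{
		Size: 36, Count: 1, AvgObjSize: 36,
		StorageSize: 4096, TotalIndexSize: 4096, TotalSize: 8192,
	}
	fmt.Println(consistent(sample)) // prints "true"
}
```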
36 changes: 36 additions & 0 deletions metricbeat/module/mongodb/collstats/_meta/fields.yml
@@ -102,3 +102,39 @@
type: long
description: >
Number of database commands executed.

- name: stats
type: group
fields:
- name: stats.size
type: long
description: >
The total uncompressed size in memory of all records in a collection.
- name: stats.count
type: long
description: >
The number of objects or documents in this collection.
- name: stats.avgObjSize
type: long
description: >
The average size of an object in the collection.
- name: stats.storageSize
type: long
description: >
The total amount of storage allocated to this collection for document storage.
- name: stats.totalIndexSize
type: long
description: >
The total size of all indexes.
- name: stats.totalSize
type: long
description: >
The sum of the storageSize and totalIndexSize.
- name: stats.max
type: long
description: >
Shows the maximum number of documents that may be present in a capped collection.
- name: stats.nindexes
type: long
description: >
The number of indexes on the collection. All collections have at least one index on the _id field.
62 changes: 49 additions & 13 deletions metricbeat/module/mongodb/collstats/collstats.go
@@ -21,11 +21,13 @@ import (
"context"
"errors"
"fmt"
"sync"

"github.com/elastic/beats/v7/metricbeat/mb"
"github.com/elastic/beats/v7/metricbeat/module/mongodb"

"go.mongodb.org/mongo-driver/bson"
"go.mongodb.org/mongo-driver/mongo"
)

func init() {
@@ -70,10 +72,6 @@ func (m *Metricset) Fetch(reporter mb.ReporterV2) error {
}
}()

if err != nil {
return fmt.Errorf("could not get a list of databases: %w", err)
}

// This info is only stored in 'admin' database
db := client.Database("admin")
res := db.RunCommand(context.Background(), bson.D{bson.E{Key: "top"}})
@@ -95,6 +93,12 @@ func (m *Metricset) Fetch(reporter mb.ReporterV2) error {
return errors.New("collection 'totals' are not a map")
}

if err = res.Err(); err != nil {
return fmt.Errorf("'top' command failed: %w", err)
}

wg := &sync.WaitGroup{}

for group, info := range totals {
if group == "note" {
continue
@@ -106,16 +110,48 @@ func (m *Metricset) Fetch(reporter mb.ReporterV2) error {
continue
}

event, err := eventMapping(group, infoMap)
if err != nil {
reporter.Error(fmt.Errorf("mapping of the event data filed: %w", err))
continue
}

reporter.Event(mb.Event{
MetricSetFields: event,
})
wg.Add(1)
go func(eventReporter mb.ReporterV2, mongoClient *mongo.Client, group string) {
Member

Also, what's the rationale behind using goroutines here? Even if we keep them, can we use bounded concurrency? We don't want to fire too many queries and burden the customer's MongoDB server.

    sem := make(chan struct{}, 10) // Limit concurrent operations
    
    for group, info := range totals {
        sem <- struct{}{} // Acquire
        go func() {
            defer func() { <-sem }() // Release
            // Existing goroutine code
        }()
    }

I mean, can we add a semaphore or worker pool to limit the concurrency?
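The semaphore idea in this comment can be turned into a runnable example. This is a minimal sketch using only the standard library, not the code merged in this PR; `processAll`, its `work` callback, and the limit value are hypothetical. It passes the loop variable into the goroutine explicitly and waits for all goroutines before returning, two details the snippet in the comment leaves out.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// processAll runs work for every key with at most limit goroutines in
// flight and returns the highest concurrency it observed.
func processAll(keys []string, limit int, work func(string)) int32 {
	sem := make(chan struct{}, limit) // buffered channel used as a semaphore
	var wg sync.WaitGroup
	var inFlight, peak int32

	for _, key := range keys {
		sem <- struct{}{} // acquire; blocks while limit goroutines are running
		wg.Add(1)
		go func(key string) {
			defer wg.Done()
			defer func() { <-sem }() // release

			// Track the peak number of concurrently running workers.
			n := atomic.AddInt32(&inFlight, 1)
			for {
				p := atomic.LoadInt32(&peak)
				if n <= p || atomic.CompareAndSwapInt32(&peak, p, n) {
					break
				}
			}
			work(key)
			atomic.AddInt32(&inFlight, -1)
		}(key)
	}

	wg.Wait()
	return atomic.LoadInt32(&peak)
}

func main() {
	keys := []string{"db.a", "db.b", "db.c", "db.d", "db.e"}
	var processed int32
	peak := processAll(keys, 2, func(string) { atomic.AddInt32(&processed, 1) })
	fmt.Println(processed, peak <= 2) // prints "5 true"
}
```

Acquiring the slot in the loop rather than inside the goroutine means the loop itself blocks, so no more than `limit` goroutines ever exist at once.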

Member

TIP: Also, without bounded concurrency, errgroup is a cleaner way to implement this: https://pkg.go.dev/golang.org/x/sync/errgroup

Contributor

Agreed. For large datasets, the number of goroutines can overwhelm the system. Let's use a worker pool to limit the concurrency.

Contributor Author

I've replaced WaitGroup with errgroup. Btw, I set the goroutine limit to 10, as suggested in the example. Is 10 a good limit, or should I change it to something else?
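For context, `errgroup.Group` from `golang.org/x/sync/errgroup` provides `SetLimit`, `Go`, and `Wait`: `Go` blocks once the limit is reached, and `Wait` returns the first non-nil error. The sketch below approximates those semantics with the standard library only, so it runs without extra dependencies. `limitedGroup`, `newLimitedGroup`, and `errInvalid` are hypothetical stand-ins; this is neither the errgroup implementation nor the code in this PR.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// errInvalid is a hypothetical sentinel error used by the example.
var errInvalid = errors.New("collection name invalid")

// limitedGroup is a simplified, hypothetical stand-in for errgroup.Group
// with a concurrency limit.
type limitedGroup struct {
	wg      sync.WaitGroup
	sem     chan struct{}
	errOnce sync.Once
	err     error
}

func newLimitedGroup(limit int) *limitedGroup {
	return &limitedGroup{sem: make(chan struct{}, limit)}
}

// Go blocks until a slot is free, then runs f in a new goroutine.
func (g *limitedGroup) Go(f func() error) {
	g.sem <- struct{}{}
	g.wg.Add(1)
	go func() {
		defer g.wg.Done()
		defer func() { <-g.sem }()
		if err := f(); err != nil {
			// Keep only the first error, as errgroup does.
			g.errOnce.Do(func() { g.err = err })
		}
	}()
}

// Wait blocks until all goroutines finish and returns the first error.
func (g *limitedGroup) Wait() error {
	g.wg.Wait()
	return g.err
}

func main() {
	g := newLimitedGroup(10)
	for _, key := range []string{"db.users", "invalid", "db.orders"} {
		key := key // copy needed before Go 1.22
		g.Go(func() error {
			if key == "invalid" {
				return errInvalid
			}
			return nil
		})
	}
	fmt.Println(g.Wait()) // prints "collection name invalid"
}
```

Whether 10 is a suitable limit depends on the deployment being monitored; it is a tuning value that this sketch does not settle.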

defer wg.Done()

names, err := splitKey(group)
if err != nil {
eventReporter.Error(fmt.Errorf("splitting a collection key failed: %w", err))
return
}

collStats, err := fetchCollStats(mongoClient, names[0], names[1])
Contributor

Better naming for names[0], names[1] please
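One way to address this, sketched under the assumption that the helper's signature may change: return the database and collection names as separate values instead of a slice. `splitKeyNamed` is a hypothetical name; the `splitKey` helper in this diff returns `[]string`.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// splitKeyNamed splits a "database.collection" key at the first dot.
// Collection names may themselves contain dots, so only the first
// separator is significant.
func splitKeyNamed(key string) (dbName, collName string, err error) {
	db, coll, found := strings.Cut(key, ".")
	if !found {
		return "", "", errors.New("collection name invalid")
	}
	return db, coll, nil
}

func main() {
	dbName, collName, err := splitKeyNamed("test.system.views")
	fmt.Println(dbName, collName, err) // prints "test system.views <nil>"
}
```

The call site then reads `fetchCollStats(mongoClient, dbName, collName)` rather than indexing into `names`.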

if err != nil {
eventReporter.Error(fmt.Errorf("fetching collStats failed: %w", err))
return
}

infoMap["stats"] = collStats

event, err := eventMapping(group, infoMap)
if err != nil {
eventReporter.Error(fmt.Errorf("mapping of the event data failed: %w", err))
return
}

eventReporter.Event(mb.Event{
MetricSetFields: event,
})
}(reporter, client, group)
}

wg.Wait()

return nil
}

func fetchCollStats(client *mongo.Client, dbName, collectionName string) (map[string]interface{}, error) {
db := client.Database(dbName)
colStats := db.RunCommand(context.Background(), bson.M{"collStats": collectionName})
var statsRes map[string]interface{}
if err := colStats.Decode(&statsRes); err != nil {
return nil, fmt.Errorf("could not decode mongo response for database=%s, collection=%s: %w", dbName, collectionName, err)
}

return statsRes, nil
}
27 changes: 23 additions & 4 deletions metricbeat/module/mongodb/collstats/data.go
@@ -25,10 +25,9 @@ import (
)

func eventMapping(key string, data mapstr.M) (mapstr.M, error) {
names := strings.SplitN(key, ".", 2)

if len(names) < 2 {
return nil, errors.New("collection name invalid")
names, err := splitKey(key)
if err != nil {
return nil, err
}

event := mapstr.M{
@@ -91,6 +90,16 @@ func eventMapping(key string, data mapstr.M) (mapstr.M, error) {
},
"count": mustGetMapStrValue(data, "commands.count"),
},
"stats": mapstr.M{
"size": mustGetMapStrValue(data, "stats.size"),
"count": mustGetMapStrValue(data, "stats.count"),
"avgObjSize": mustGetMapStrValue(data, "stats.avgObjSize"),
"storageSize": mustGetMapStrValue(data, "stats.storageSize"),
"totalIndexSize": mustGetMapStrValue(data, "stats.totalIndexSize"),
"totalSize": mustGetMapStrValue(data, "stats.totalSize"),
"max": mustGetMapStrValue(data, "stats.max"),
"nindexes": mustGetMapStrValue(data, "stats.nindexes"),
Member

Contributor

There are quite a lot of fields that we get here, @shmsr.
We planned to list only the most useful ones.
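Listing only the most useful fields can be sketched as an allow-list applied to the decoded `collStats` response. This is an illustration, not the PR's code, which reads each key individually through `mustGetMapStrValue`. `pickStats` and `wantedStats` are hypothetical names, and the key list mirrors the eight fields added in this diff.

```go
package main

import "fmt"

// wantedStats lists the collStats keys to keep (hypothetical allow-list,
// matching the fields added in this PR).
var wantedStats = []string{
	"size", "count", "avgObjSize", "storageSize",
	"totalIndexSize", "totalSize", "max", "nindexes",
}

// pickStats copies only the allow-listed keys that are present in raw.
func pickStats(raw map[string]interface{}) map[string]interface{} {
	out := make(map[string]interface{}, len(wantedStats))
	for _, key := range wantedStats {
		if v, ok := raw[key]; ok {
			out[key] = v
		}
	}
	return out
}

func main() {
	raw := map[string]interface{}{
		"size":       36,
		"count":      1,
		"ns":         "db.coll",
		"wiredTiger": map[string]interface{}{},
	}
	fmt.Println(pickStats(raw)) // prints "map[count:1 size:36]"
}
```

Unlike `mustGetMapStrValue`, which yields a nil value for a missing key, this sketch omits absent keys entirely (for example `max` on a collection that is not capped).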

},
}

return event, nil
@@ -100,3 +109,13 @@ func mustGetMapStrValue(m mapstr.M, key string) interface{} {
v, _ := m.GetValue(key)
return v
}

func splitKey(key string) ([]string, error) {
names := strings.SplitN(key, ".", 2)

if len(names) < 2 {
return nil, errors.New("collection name invalid")
}

return names, nil
}
2 changes: 1 addition & 1 deletion metricbeat/module/mongodb/fields.go

Large diffs are not rendered by default.