Add DB serialization support #132

cretz · 2021-12-27T19:14:15Z

Added support for sqlite3_serialize (https://www.sqlite.org/c3ref/serialize.html) and sqlite3_deserialize (https://www.sqlite.org/c3ref/deserialize.html).

FiloSottile · 2022-02-23T11:54:22Z

I was about to send a PR to expose the same functions!

IMHO, bundling the schema name and the serialization does not make much sense, though. The schema name is a context-specific choice, not a property of the serialization. For example, one can serialize the main database, and then deserialize it later as snapshot.

FiloSottile · 2022-02-23T12:11:59Z

serialize.go

+
+// Reopens the database as in-memory representation of given serialized bytes.
+// The given *Serialized instance should remain referenced (i.e. not GC'd) for
+// the life of the DB since the bytes within are referenced directly.


This is not something the caller can guarantee, short of importing go4.org/unsafe/assume-no-moving-gc.

Generally, it's probably not a good idea to let C and Go memory cross the boundary like this. It adds complexity, including the need for finalizers and the invalid state of Serialized values after they have been used.

In Serialize, I'd always make a copy into Go memory unless SQLITE_SERIALIZE_NOCOPY is set. That way there is never a need to free the memory. If an application can't stand copies for performance reasons, it can use SQLITE_SERIALIZE_NOCOPY. Otherwise, it can probably tolerate two copies as well as one.

In Deserialize, I would always copy the input into sqlite3_malloc64'd memory, and force SQLITE_DESERIALIZE_FREEONCLOSE on. This is unfortunate in terms of memory usage, especially when using //go:embed, which is my use case, and it basically precludes the use of the szBuf > szDb feature, but it is the only safe API because we can't rely on Go memory staying still.

The good news is that this removes any need for the Serialized type.

cretz · 2022-02-23

Hrmm, yeah all my machinations here were to avoid copies, but it sounds like I can't have a stable pointer. So I guess we have to have copies.

I did this for a side project (https://github.com/cretz/temporal-sdk-go-advanced/tree/main/temporalsqlite for SQLite on https://temporal.io/) so it may be a few days until I can get to simplifying. If you need it sooner or have the bandwidth, feel free to steal the impl here and simplify w/ always-copy.

FiloSottile · 2022-02-25

Oh, actually it might be worth waiting for https://go.dev/issue/46787. I wonder if we can provide the same API I describe above, and implement it with copies in Go 1.18, and with Pinner in Go 1.19.

mitar · 2024-01-02

golang/go#46787 has landed.
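For context, the proposal shipped as runtime.Pinner in Go 1.21 rather than in 1.19. The following is only a rough sketch of how a serialized image might be kept pinned for the life of a connection. The names `pinnedImage` and `pinImage` are hypothetical, nothing here is part of this PR, and no SQLite call is made.

```go
package main

import (
	"fmt"
	"runtime"
)

// pinnedImage is a hypothetical holder for a serialized database image whose
// backing array must not move while C code holds a pointer into it.
type pinnedImage struct {
	data   []byte
	pinner runtime.Pinner
	pinned bool
}

// pinImage pins the backing array of data (requires Go 1.21 or later).
// An empty slice has no element to pin, so it is stored unpinned.
func pinImage(data []byte) *pinnedImage {
	img := &pinnedImage{data: data}
	if len(data) > 0 {
		img.pinner.Pin(&img.data[0])
		img.pinned = true
	}
	return img
}

// Close releases the pin. Call it only once the C side no longer uses the
// pointer (for a deserialized database, after the connection is closed).
func (img *pinnedImage) Close() {
	if img.pinned {
		img.pinner.Unpin()
		img.pinned = false
	}
	img.data = nil
}

func main() {
	img := pinImage([]byte("SQLite format 3\x00"))
	defer img.Close()
	fmt.Println(len(img.data), img.pinned)
}
```

Pinning avoids the copy, but the caller still has to leave the bytes unmodified and tie Close to the connection's lifetime, so the always-copy API may remain the simpler default.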

cretz · 2022-02-23T14:41:14Z

> IMHO, bundling the schema name and the serialization does not make much sense, though.

Agreed.

AdamSLevy · 2022-06-18T19:31:39Z

serialize.go

+// Bytes returns the serialized bytes. Do not mutate this value. This is only
+// valid for the life of its receiver and should be copied for any other
+// longer-term use.
+func (s *Serialized) Bytes() []byte { return s.data }


I would prefer more robust API semantics for this. This is too error-prone. Assume the user of this API is not going to read your docs. How can we make it foolproof? Consider that there is virtually no case where someone is going to instantiate a Serialized object and not copy the data. So why bother with the intermediary? Why even have a Serialized object at all?

IMO the ideal API should just accept a Writer to which the serialized data is written, or accept a Reader from which the serialized data is read so it can be instantiated into the database.
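As a rough illustration of that shape, here is a sketch with the schema name passed separately at each call site. `memConn` and `roundTrip` are hypothetical stand-ins, not the real `*Conn`: they keep one byte image per schema name so the example runs without cgo or SQLite.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
)

// memConn is a hypothetical stand-in for a connection: one byte image per
// schema name, with no real database behind it.
type memConn struct {
	schemas map[string][]byte
}

// Serialize writes the named schema's image to w. The io.Writer contract
// forbids w from retaining the slice, so no internal memory escapes.
func (c *memConn) Serialize(schema string, w io.Writer) error {
	img, ok := c.schemas[schema]
	if !ok {
		return fmt.Errorf("unknown schema %q", schema)
	}
	_, err := w.Write(img)
	return err
}

// Deserialize reads the whole image from r and installs it under schema.
func (c *memConn) Deserialize(schema string, r io.Reader) error {
	img, err := io.ReadAll(r)
	if err != nil {
		return err
	}
	c.schemas[schema] = img
	return nil
}

// roundTrip serializes schema `from` of src and loads it into dst as `to`.
func roundTrip(src *memConn, from string, dst *memConn, to string) error {
	var buf bytes.Buffer
	if err := src.Serialize(from, &buf); err != nil {
		return err
	}
	return dst.Deserialize(to, &buf)
}

func main() {
	src := &memConn{schemas: map[string][]byte{"main": []byte("image bytes")}}
	dst := &memConn{schemas: map[string][]byte{}}
	// The schema name is chosen per call, not stored with the bytes.
	if err := roundTrip(src, "main", dst, "snapshot"); err != nil {
		panic(err)
	}
	fmt.Printf("%s\n", dst.schemas["snapshot"])
}
```

With this shape there is no intermediate value whose lifetime the caller has to reason about, and serializing `main` then loading it as `snapshot` needs no special support.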

AdamSLevy · 2022-06-18T19:33:04Z

serialize.go

+	if len(s.data) > 0 && s.shouldFreeData {
+		s.shouldFreeData = false
+		s.sqliteOwnsData = false
+		C.sqlite3_free(unsafe.Pointer(&s.data[0]))
+	}


Complex logic with no comments. Why is this necessary?

AdamSLevy · 2022-06-18T19:38:08Z

serialize.go

+// the database.
+//
+// https://www.sqlite.org/c3ref/serialize.html
+func (conn *Conn) Serialize(schema string, flags ...SerializeFlags) *Serialized {


Accept a writer and write the copied serialized data to it.

AdamSLevy · 2022-06-18T19:50:00Z

serialize.go

+const (
+	SQLITE_DESERIALIZE_FREEONCLOSE DeserializeFlags = C.SQLITE_DESERIALIZE_FREEONCLOSE
+	SQLITE_DESERIALIZE_RESIZEABLE  DeserializeFlags = C.SQLITE_DESERIALIZE_RESIZEABLE
+	SQLITE_DESERIALIZE_READONLY    DeserializeFlags = C.SQLITE_DESERIALIZE_READONLY
+)


I think this is too low-level to expose in the Go API here. When we deserialize, we are taking Go memory holding the serialized DB and malloc-ing a region of C memory in which SQLite can operate an in-memory database. I think the only options we should expose are READONLY and maybe RESIZEABLE. Since we are managing the C memory, we should control FREEONCLOSE, which we should probably always set, so we don't have to continue to manage that memory.
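One way this restriction could look, as a sketch only: `DeserializeOptions` and `cFlags` are hypothetical names, and the numeric constants mirror the values defined in sqlite3.h instead of going through cgo so that the example stays self-contained.

```go
package main

import "fmt"

// Flag values as defined in sqlite3.h for sqlite3_deserialize.
const (
	deserializeFreeOnClose uint = 1 // SQLITE_DESERIALIZE_FREEONCLOSE
	deserializeResizeable  uint = 2 // SQLITE_DESERIALIZE_RESIZEABLE
	deserializeReadOnly    uint = 4 // SQLITE_DESERIALIZE_READONLY
)

// DeserializeOptions is a hypothetical options struct: callers can only ask
// for read-only or resizeable behaviour.
type DeserializeOptions struct {
	ReadOnly   bool
	Resizeable bool
}

// cFlags builds the flag word for sqlite3_deserialize. FREEONCLOSE is always
// set because the wrapper, not the caller, allocated the C buffer.
func (o DeserializeOptions) cFlags() uint {
	flags := deserializeFreeOnClose
	if o.ReadOnly {
		flags |= deserializeReadOnly
	}
	if o.Resizeable {
		flags |= deserializeResizeable
	}
	return flags
}

func main() {
	fmt.Println(DeserializeOptions{}.cFlags())                                 // prints 1
	fmt.Println(DeserializeOptions{ReadOnly: true}.cFlags())                   // prints 5
	fmt.Println(DeserializeOptions{ReadOnly: true, Resizeable: true}.cFlags()) // prints 7
}
```

Because FREEONCLOSE never appears in the public surface, a caller cannot end up with a C buffer that nobody frees.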

AdamSLevy · 2022-06-18T19:53:03Z

serialize.go

+// The Serialized parameter should no longer be used after this call.
+//
+// https://www.sqlite.org/c3ref/deserialize.html
+func (conn *Conn) Deserialize(s *Serialized, flags ...DeserializeFlags) error {


I recommend this function signature:

Suggested change

-func (conn *Conn) Deserialize(s *Serialized, flags ...DeserializeFlags) error {
+func (conn *Conn) Deserialize(serialized io.Reader, size, additional int, flags ...DeserializeFlags) error {

And then use size+additional to malloc the C memory and then copy the contents of serialized into that memory.

Set FREEONCLOSE so we don't have to continue to manage that C memory.
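The copy step could be sketched roughly as follows. `fillImage` and `fillImageFromBytes` are hypothetical helpers, and a Go slice stands in for the sqlite3_malloc64'd buffer, so this shows only the sizing and reading logic, not the actual cgo call.

```go
package main

import (
	"bytes"
	"errors"
	"fmt"
	"io"
)

// fillImage allocates size+additional bytes and copies exactly size bytes
// from serialized into the front of the buffer. In a real binding the buffer
// would be C memory passed to sqlite3_deserialize with szDb = size and
// szBuf = size+additional; here a Go slice stands in for it.
func fillImage(serialized io.Reader, size, additional int) ([]byte, error) {
	if size < 0 || additional < 0 {
		return nil, errors.New("size and additional must be non-negative")
	}
	buf := make([]byte, size+additional)
	// io.ReadFull returns an error if the reader ends before size bytes.
	if _, err := io.ReadFull(serialized, buf[:size]); err != nil {
		return nil, fmt.Errorf("reading serialized database: %w", err)
	}
	return buf, nil
}

// fillImageFromBytes is a convenience wrapper for images already in memory,
// such as data loaded with //go:embed.
func fillImageFromBytes(data []byte, additional int) ([]byte, error) {
	return fillImage(bytes.NewReader(data), len(data), additional)
}

func main() {
	buf, err := fillImageFromBytes([]byte("SQLite format 3\x00"), 4096)
	if err != nil {
		panic(err)
	}
	fmt.Println(len(buf))
}
```

Passing the size explicitly lets the buffer be allocated once up front; the extra `additional` bytes are the headroom that would give the database room to grow before any reallocation is needed.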

Commit 903935e: Add DB serialization support

FiloSottile reviewed Feb 23, 2022

AdamSLevy requested changes Jun 18, 2022

FiloSottile Feb 23, 2022

cretz Feb 23, 2022

FiloSottile Feb 25, 2022

mitar Jan 2, 2024
