Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Elaborate on 'upsertManyWhere' #1537

Open
wants to merge 2 commits into
base: master
Choose a base branch
from

Conversation

lfborjas
Copy link

@lfborjas lfborjas commented May 30, 2024

Expand the documentation of the Postgres version of upsertManyWhere: it wasn't clear how the second argument behaves and at least two coders interpreted it backwards introducing a subtle production bug; hopefully the examples now make it clear!

I don't have a dev env setup for this so I didn't run stylish-haskell, nor do I know if a documentation update warrants a version bump — happy to comply if necessary though!

Before submitting your PR, check that you've:

  • Documented new APIs with Haddock markup
  • Added @since declarations to the Haddock
  • Ran stylish-haskell on any changed files.
  • Adhered to the code style (see the .editorconfig file for details)

After submitting your PR:

  • Update the Changelog.md file with a link to your PR
  • Bumped the version number if there isn't an (unreleased) on the Changelog
  • Check that CI passes (or if it fails, for reasons unrelated to your change, like CI timeouts)

Copy link
Collaborator

@parsonsmatt parsonsmatt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a big improvement, thank you! Left a few comments for further improvement

persistent-postgresql/Database/Persist/Postgresql.hs Outdated Show resolved Hide resolved
persistent-postgresql/Database/Persist/Postgresql.hs Outdated Show resolved Hide resolved
Comment on lines +1719 to +1729
-- If there's unique key collisions for some or all of the proposed insertions,
-- you can use the second argument to specify which fields, under which
-- conditions (using 'HandleUpdateCollision' strategies,) will be copied from
-- each proposed, conflicting new record onto its corresponding row already present
-- in the database. Helpers such as 'copyField', 'copyUnlessEq',
-- 'copyUnlessNull' and 'copyUnlessEmpty' are exported from this module to help
-- with this, as well as the constructors for 'HandleUpdateCollision'.
-- For example, you may have a patch of updates to apply, some of which are NULL
-- to represent "no change"; you'd use 'copyUnlessNull' for each applicable
-- field in this case so any new values in the patch which are NULL don't
-- accidentally overwrite existing values in the database which are non-NULL.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This wording is a little confusing to me (but, hey, the function itself is a bit confusing!) I think if I were to write it, I'd say something like:

The second list describes what happens if one of the records to be inserted conflicts with a unique key in the database. The 'HandleUpdateCollision' type has more documentation on this. Briefly, we can:

  1. copyField will copy the value from the record we tried to insert into the record in the database table.
  2. copyUnlessEq (describe this)
  3. copyUnlessNull (describe this)
  4. etc

Copy link
Author

@lfborjas lfborjas May 31, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like the more abridged approach, thank you! — I'm not 100% sure but, if I follow the docs I think the one of the records part might be a lil bit misleading, since there could be multiple collisions each of which will be resolved in the ON CONFLICT clause (in my mind one of means that only one collision would be resolved? Maybe I'm reading too deep into it.)

Maybe something like what happens for a record proposed for insertion whose unique key conflicts with an existing record in the database (as you can see, the more precise I attempt to get, the more I veer into verbosity lol)

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a great point! Thanks.

It is a rather tricky thing to articulate!

The second list describes what happens to the records which have unique key conflicts with existing rows in the database. The HandleUpdateCollision type documents this more thoroughly. Briefly,

  1. copyField will update the row in the database with the value of the field of the record we attempted to insert.

is another attempt, but I'm not thrilled with it.

But let me try and break this down:

  1. We have a [record] that we want to upsert into the database.
  2. If the record does not exist, then we insert it.
  3. If the record already exists by some Unique record, then we use [HandleUpdateCollision record] to specify how to handle the collision for that record.

You know, it's occuring to me that this is much easier to express in the singular. For example, upsertWhere :: rec -> [HandleUpdateCollision rec] -> [Update rec] -> [Filter rec] -> m (). We can talk about that like:

Attempt to insert the record into the database. If there is a unique key conflict, and the [Filter rec] are satisfied for the record we are attempting to insert, then we use the [HandleUpdateCollision record] and [Update rec] to specify how to update the record.

I think that [Update rec] aren't included in upsertWhere because it's easy enough to just edit the record itself and copy the field in.

But if upsertWhere is documented in an easy to explain manner, then we can make upsertManyWhere as:

Like upsertWhere, but does a batch upsert for all of the records in the list.

Comment on lines +1731 to +1734
-- Further updates to the matching records already in the database can be
-- specified in the third argument, these are arbitrary updates independent of
-- the new records proposed for insertion. For example, you can use this to
-- update a timestamp or a counter for all conflicting rows.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I might start with "The third argument is a list containing updates..." or similar, so the reader knows that we're now talking about the third argument. I'd also front-load that the update only happens if a conflict occurred.

Possibly,

The third argument specifies arbitrary Updates you can perform on conflicting rows.

With some more elaboration

Comment on lines +1736 to +1740
-- You can use the fourth argument to provide arbitrary conditions to constrain
-- the above updates. This way, you can choose to only alter a subset of the
-- conflicting existing rows while leaving the others alone. For example, you
-- may want to copy fields from new records and update a timestamp only when the
-- corresponding existing row hasn't been soft-deleted.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fantastic example!

Will also be working in the rephrasing requested — stay tuned!

Co-authored-by: Matt Parsons <parsonsmatt@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants