Reproducing the NEB Tm calculator behavior in pydna #237

BjornFJohansson · 2024-06-25T16:39:46Z

As per the 4th Pydna meeting, it was generally agreed that Pydna and tools that depend on it would benefit from being able to simulate modern commercial Tm calculators offline.

I have compiled data around this here

The NEB.ipynb notebook contain attempts to replicate the NEB Tm calculator simply by selecting parameters in the biopython Tm module. This was successful for the standard Taq polymerase with the buffer composition stated by NEB.

I was not able to guess the result for the Q5 polymerase. A contributing factor is the secrecy of the buffer composition.

The NEB logics seems contained in the file NEB/NEB_website/NEB Tm Calculator_files/main-3d92a74abb.js

I formatted the code with an online js formatter in the file main-3d92a74abb_formatted.js.

The variables below seem important, but I was not able to follow the logics.

r.TmCalc.prototype.nnBr
r.TmCalc.prototype.nnSa
r.TmCalc.prototype.nnde
r.TmCalc.prototype.nndhds
r.TmCalc.prototype.nnloop
r.TmCalc.prototype.nnbulge
r.TmCalc.prototype.nntmm
r.TmCalc.prototype.R=1.987,
r.TmCalc.prototype.dSBr
r.TmCalc.prototype.dSSa
r.TmCalc.prototype.dHBr
r.TmCalc.prototype.dHSa
r.TmCalc.prototype.init
r.TmCalc.prototype.saltCorrect
r.TmCalc.prototype.setCt
r.TmCalc.prototype.setMonosalt
r.TmCalc.prototype.setDisalt
r.TmCalc.prototype.setDMSO
r.TmCalc.prototype.buildPairMap

Some other people have worked on similar issues here and here

I think this is worth spending some time, since interestingly, the Thermofisher Tm calculator here gives similar, but not identical results and claims to use similar but not identical thermodynamic data.

The text was updated successfully, but these errors were encountered:

manulera · 2024-06-25T17:55:25Z

@BjornFJohansson @hiyama341 , it would be great if you could provide:

The place where you could see the javascript code
An example request to the endpoint

BjornFJohansson · 2024-06-27T07:16:38Z

Nearest Neighbor Method explained: https://www.youtube.com/watch?v=jy5h9NwBEgk

BjornFJohansson · 2024-06-28T14:38:56Z

Minimal example extracted from teemi. The results are identical to the web interface.

import json
import requests

# NEB example primers:
primers = """\
AGCGGATAACAATTTCACACAGGA
GTAAAACGACGGCCAGT
AGCGGATAACAATTTCAC
AGCGGATAAGGGCAATTTCAC
GTAAAACGACGGCCA""".splitlines()

conc=0.5
prodcode="q5-0"
url = "https://tmapi.neb.com/tm/batch"

for primer in primers:
    seqpairs = [[primers]]
    input_ = {"seqpairs": seqpairs, "conc": conc, "prodcode": prodcode}
    headers = {"content-type": "application/json"}
    res = requests.post(url, data=json.dumps(input_), headers=headers)
    r = json.loads(res.content)
    
    if r["success"]:
        for row in r["data"]:
            print(row["tm1"])
    else:
        print("request failed")
        print(r["error"][0])
        
# Result  
# 66
# 62
# 57
# 65
# 59

MarkusPiotrowski · 2024-06-30T18:16:04Z

I had settings for Biopython's MeltingTemp to simulate Q5 and Phusion Tm calculations on the NEB website some time ago (at that time, calculations for Q5 and Phusion gave different results), but obviously they changed some settings in their calculator.

BjornFJohansson · 2024-07-01T06:24:02Z

Yes, I get wildly different results right now. I would like to mimic the behaviour of the NEB calculator, the source code is not easy to follow though.

manulera · 2024-07-16T13:10:10Z

I had a look at the js code and it's compiled by babel, so I don't think it is possible to extract the source code to reverse-engineer without spending a lot of time. Some possible solutions to consider @BjornFJohansson @hiyama341:

Find a first approximation using the default tm calculator, then call the api
Use asynchronous calls to the API when running several primers in parallel

They would both take some thinking, but they seem feasable, so it's a matter of how much is this slowing you down.

manulera · 2024-07-16T14:18:42Z

@hiyama341 I went ahead and made a first version using the mentioned approximation (first with the default pydna, then with the one from neb, and it's faster):

dev_bjorn...issue_237_optimisation

The file example.py compares the two approaches for finding a primer and you can see that the approximation is faster and uses fewer requests. This is what it prints. Let me know what you think.

Also, I could not use the teemi functions to make the request with "normal" headers. I was getting 403 response from the server. I had to mimmic a browser request to be able to use the API. Not sure if you encounter these problems. I could not even run their curl example. See the headers here.

making a request
making a request
making a request
making a request
making a request
result atgtcgtATGaaaccgttatcgatcatatgtGcgaaatgtcgcgcgtcatctacgtatcatcgatctactTAAacgtgta
--- 1.4502029418945312 seconds ---
making a request
making a request
making a request
making a request
making a request
making a request
making a request
making a request
making a request
making a request
making a request
making a request
making a request
making a request
making a request
making a request
making a request
making a request
making a request
making a request
making a request
making a request
making a request
making a request
making a request
making a request
making a request
making a request
making a request
making a request
making a request
making a request
result atgtcgtATGaaaccgttatcgatcatatgtGcgaaatgtcgcgcgtcatctacgtatcatcgatctactTAAacgtgta
--- 6.762638092041016 seconds ---

hiyama341 · 2024-07-16T17:48:23Z

@manulera and @BjornFJohansson great work!

Yeah, two days ago I started getting the 403 response from the server. They must have changed something cause it was working just fine before. And thanks I'm gonna try to add the headers to teemi and see if it works.

Regarding the workload, maybe we can discuss on the 5th pydna meeting whether it is worth to reverse-engineer the primer calc function. Personally, I think it would be useful to have something fast and stable when we are calculating/making hundreds of primers but let's discuss next week.

BjornFJohansson · 2024-07-17T07:14:18Z

I agree with @hiyama341 on this one. Localized Tm code would be very useful and I suspect would convince potential users.

manulera · 2024-07-17T12:10:20Z

I also agree that ideally we would want the code in the library and not rely on an external API that may cut access like they just did.

The issue is that it's not as simple as "translating the javascript code into python". The javascript code we can access from their website is bundled code. This is generated by javascript libraries like Babel, which basically make a single javascript file with all the code for the page, without dependencies on external files or libraries, and maximizing browser compatibility. This means that the source code that people in NEB wrote looks very different (and much more understandable) that what is in that file which is not meant to be read by humans, so reverse-engineering would take a lot of time and knowledge of javascript.

MarkusPiotrowski · 2024-07-17T12:15:31Z

Why don't you use Biopython's MeltingTemp and find the correct settings to simulate NEB's TM calculator? Isn't Biopython a library that is likely to be used together with PyDNA?

manulera · 2024-07-17T13:04:08Z

Hi @MarkusPiotrowski NEB's calculator gives different values depending on the polymerase that will be used, and overall does a more complex calculation from what the others said. When I was doing PCR I was never too concerned with exact Tm calculations but it seems some people trust NEB's calculator over anything else and would exclude tools that use other Tm calculations.

MarkusPiotrowski · 2024-07-17T14:50:34Z

@manulera I'm the author of Biopython's MeltingTemp. There are several options that are usually applied for calculating Tm's with the nearest neighbor thermodynamic method:

Initialization (different methods are used in literature).
"Zipping", again different tables with thermodynamic data are available.
Salt correction. There are different methods and, of course, you can set different values.

I would wonder if NEB uses a "more sophisticated" method and if their results can't be simulated with Biopython's MeltingTemp. I guess it's a question of finding the right parameters, and there are numerous possibilities for them. And of course they may have different parameters for different enzymes.

Their own help page gives some details (if they stick to them and is still up-to-date): https://tmcalculator.neb.com/#!/help
E.g. they write that they use different salt correction method for Phusion.

BjornFJohansson · 2024-07-17T18:56:27Z

@MarkusPiotrowski @manulera @hiyama341 Please have a look at this comparison Biopython vs NEB Seems like there is a ~7 degree difference for a large list of primers. I replicated the behaviour as best I could given the sometimes incomplete information given by NEB.

BjornFJohansson · 2024-07-17T20:00:46Z

In this repo they somehow run the NEB Tm Calculator from Python.

https://github.com/kalekundert/ligrna

MarkusPiotrowski · 2024-07-18T11:26:16Z

@BjornFJohansson Thanks for the pointers tho these pages.

For Taq polymerase I now managed to simulate NEB's Tm calculator with Biopython's MeltingTemp. In NEB's calculator, they obviously do not correct for Mg ions. So the Mg2+ concentrations and dNTPs concentrations are irrelevant for their calculation. Thus, we need to remove them from the call to Tm_NN (when Mg is 0 [default], dNTPs are not considered). Then it fits nicely:

import Bio
from Bio.SeqUtils import MeltingTemp as mt

primers = (
    'AGCGGATAACAATTTCACACAGGA',
    'GTAAAACGACGGCCAGT',
    'AGCGGATAAGGGCAATTTCAC',
    'GTAAAACGACGGCCA',
    )

NEBtms = (58, 54, 57, 50) # from NEB Tm calculator

for temp, primer in zip(NEBtms, primers):
    tm = mt.Tm_NN(
        primer,
        nn_table=mt.DNA_NN3,
        Na=0,  # mM
        K=50,  # mM
        Tris=10,  # mM
        # Mg=1.5,
        dnac1=200,  # nM
        dnac2=0,  # nM
        dNTPs=0.8,  # mM
        saltcorr=6,
    )
    print(
        f'Primer: {primer}, Tm: NEB: {temp}, Biopython: {round(tm)}'
        f' delta: {round(tm - temp, 2)}'
    )

Result:

Primer: AGCGGATAACAATTTCACACAGGA, Tm: NEB: 58, Biopython: 58 delta: -0.39
Primer: GTAAAACGACGGCCAGT, Tm: NEB: 54, Biopython: 54 delta: -0.04
Primer: AGCGGATAAGGGCAATTTCAC, Tm: NEB: 57, Biopython: 57 delta: -0.05
Primer: GTAAAACGACGGCCA, Tm: NEB: 50, Biopython: 50 delta: 0.36

The deltas are of course because NEB's calculator already gives rounded values.

As I said, one need to find the 'correct' settings. I'll go on to check for Phusion and Q5.

MarkusPiotrowski · 2024-07-18T11:49:49Z

I can't find the composition of Phusions HF buffer, however, looking on the assay conditions and some assumption, I would suggest the following settings:

# For Phusion, use this settings
tm = mt.Tm_NN(
        primer,
        nn_table=mt.DNA_NN3,
        Na=0,  # mM
        K=50,  # mM
        Tris=25,  # mM
        Mg=1.5,
        dnac1=200,  # nM
        dnac2=0,  # nM
        dNTPs=0.8,  # mM
        saltcorr=1,
)

These give values that derive from the rounded NEB data by about 0.2 °C.

As written in their Help page, for Phusion they use a different salt correction method and, obviously, there they include Mg and dNTPs concentrations in their calculation.

MarkusPiotrowski · 2024-07-19T13:43:13Z

These settings fit for Q5:

tm = mt.Tm_NN(
        primer,
        nn_table=mt.DNA_NN3,
        Na=0,  # mM
        K=50,  # mM
        Tris=10,  # mM
        Mg=1.5,
        dnac1=200,  # nM
        dnac2=0,  # nM
        dNTPs=0.8,  # mM
        saltcorr=6,
    )

BjornFJohansson · 2024-07-19T15:56:57Z

Thank you all for your contributions. Ill look into it asap.

MarkusPiotrowski · 2024-07-19T16:10:54Z

Let me add a little bit to the discussion:

While I could provide settings to simulate quite well the Tm values of NEB's Tm calculator (see above, fine for Taq and Q5, some small deviation with Phusion sometimes), I wonder if this is a goal you want to go for.
Why is NEB the source you want to trust? As already mentioned in the introduction by @BjornFJohansson , FisherScientifics' Tm calculator gives largely (3-4 °C) different results for Phusion than NEB's calculator. These calculators may also change (well, they changed in the past), you never know if they do the calculation correct (if you just want to simulate you may want to ignore this), and it's hard to find the correct parameters. And these may also change in the future.

When I rewrote Biopython's MeltingTemp code some years ago, I also compared the results to other calculators (mostly to check if I do the math correctly), but I focused on established academic tools, at that time this was MELTING and Primer3 or Primer3Plus. Online calculators from companies were hard to compare with because except some general statements ("using nearest neighbor.... according to...."), detailed information about the exact usage of the algorithm (which thermodynamic tables do they use, how do they calculate the initialization, which salt correction algorithm do they use) were sparse.

BTW, this source https://github.com/kalekundert/ligrna (mentioned by @BjornFJohansson ) seems to contain the pre-2016 code? (it mentions Breslauer, that is not mentioned in the recent pages).

Also, I browsed through the Javascript and while I can find the calculations and published constants, I don't find the actual buffer salt concentrations.

manulera · 2024-07-22T08:45:53Z

Hi @MarkusPiotrowski thanks for testing, it is very appreciated.

Elaborating a bit more on why we were trying to match NEB's results: As far as I understand it, the reason for wanting to reproduce the NEB's calculations is not so much to achieve the most "thermodynamically correct" results. Instead, from what @hiyama341 was saying, some researchers trust NEB over other calculators because they have been using it for a long time, so it's more an issue of "habit" rather than accuracy. Because of this, they would not trust/use a tool that uses other calculators and this could be an issue for some of the tools built on top of pydna that we are working on, namely teemi and ShareYourCloning.

MarkusPiotrowski · 2024-07-22T10:44:24Z

@manulera Thank you for the explanation. You know your users better than I, so you can probably be convinced that they prefer NEB's calculator. Some years ago we had a Promega (now Thermo) freezer in the neighboring department, so we got our enzymes usually from them, and at that time I would have preferred to use their calculator.

What I would sum up from my testing and also by taking a look at the Javascript code from NEB's calculator:
I don't see any magic in the JS code and it should be possible to use Biopython's MeltingTemp to simulate their results. Interestingly, the code contains much more thermodynamic tables and formulas than they claim to use (in fact, I wouldn't wonder if they use Biopython's MeltingTemp as template).

Above you have the settings that work well. Maybe some hints:

Na, K and (Tris/2) sum up to give one 'monovalent' salt concentration. Mg may be added to that as follows: 120*([Mg2+] - [dNTPs])^0.5, but only if Mg > dNTPS
This means that you don't need to know the exact composition of the buffer. As long as you know the thermodynamic data set and the salt correction formula (and this we do), you could run a test loop with just different numbers for Na until you got the smallest deviation from the NEB results for a data set of primers.
If you want to simulate other calculators, you should know that many of them use the primer concentration in a wrong way. Often they use [primer]/4, but this is only true if you were hybridizing primers to themselves ([primer1] = [primer2]). This can be mimicked by Biopython's MeltingTemp if you set dnac1 = dnac2 = primer/2

So I would suggest, to use a small wrapper around Biopython's Meltingtemp and supply your users with different predefined sets of settings, e.g., NEB_Taq, NEB_Phusion, NEB_Q5 etc.

MarkusPiotrowski · 2024-07-22T12:24:10Z

@BjornFJohansson There is an error in https://github.com/BjornFJohansson/tm/blob/master/NEB%20Phusion/NEB%20Phusion.ipynb . You gave the dNTPs in µmolar instead of millimolar. Thus free Mg becomes 0 and ist not longer taken into consideration for calculations.

BjornFJohansson · 2024-07-22T14:07:02Z

Thanks, Ill look into it. I am compiling results for the NEB Phusion which worked perfectly.

manulera · 2024-07-22T14:11:13Z

By the way @MarkusPiotrowski, we will meet this Thursday 25th of July at 14:00 UK time to discuss some issues of pydna, including this one. If you want to join the discussion, your insight would be appreciated! Below a Teams link for that:

https://teams.microsoft.com/l/meetup-join/19%3ameeting_MTk0N2UxNTktODQ4Yy00OGQyLWFiNmEtNzZhNTZhMjMyNzVk%40thread.v2/0?context=%7b%22Tid%22%3a%221faf88fe-a998-4c5b-93c9-210a11d9a5c2%22%2c%22Oid%22%3a%22e1394e78-cf0f-492d-9394-b06f66175536%22%7d

BjornFJohansson · 2024-07-23T14:25:21Z

Dear all, Some updated comparisons between neb calculator and the different combinations of biopython options. Specifically, I made a sum of least squares minimization as suggested using Scipy.minimize for the Q5 polymerase.

https://github.com/BjornFJohansson/tm/blob/c38c6c07e73ba2f8036b19293a5a58af905db5e8/NEB%20Q5/NEB%20Q5.py

I get close, but not as close as with some of the others (thanks, @MarkusPiotrowski for the help).

There seems to be some creative mixtures of thermodynamic data in the neb tm calculator source code. The ones I could find are extracted to Python in the NEB_website folder.

Salt correction seems to be the same as biopython 7 (Owczarzy 2008) and not Owczarzy 2004 as documented.

manulera · 2024-07-31T09:38:09Z

Hello, I was working on #250 and I noticed that sometimes longer primers give lower Tms, is this normal?

from pydna.tm import tm_default
print(tm_default("agatcgactatctacttatgcatctta"))  # 58.90561619071127
print(tm_default("agatcgactatctacttatgcatctt"))  # 59.1327553388262

hiyama341 · 2024-09-05T07:42:35Z

Hi all,
I didn't mention this but I tried recreating this with brute force but couldn't get closer than Björn.
Good idea to put it on the Hackathon list @manulera :)

manulera added a commit that referenced this issue Jul 16, 2024

optimisation for #237

8b12daf

BjornFJohansson closed this as completed Jul 23, 2024

BjornFJohansson reopened this Jul 23, 2024

BjornFJohansson pushed a commit that referenced this issue Aug 8, 2024

optimisation for #237

0cc98b0

manulera changed the title ~~Expanding the oligo melting temperature calculations~~ Reproducing the NEB Tm calculator behavior in pydna Sep 5, 2024

manulera added the Hackathon-2024 label Sep 5, 2024

manulera added a commit that referenced this issue Oct 8, 2024

optimisation for #237

2c92f13

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Reproducing the NEB Tm calculator behavior in pydna #237

Reproducing the NEB Tm calculator behavior in pydna #237

BjornFJohansson commented Jun 25, 2024 •

edited

Loading

manulera commented Jun 25, 2024

BjornFJohansson commented Jun 27, 2024 •

edited

Loading

BjornFJohansson commented Jun 28, 2024 •

edited

Loading

MarkusPiotrowski commented Jun 30, 2024

BjornFJohansson commented Jul 1, 2024

manulera commented Jul 16, 2024

manulera commented Jul 16, 2024 •

edited

Loading

hiyama341 commented Jul 16, 2024

BjornFJohansson commented Jul 17, 2024

manulera commented Jul 17, 2024

MarkusPiotrowski commented Jul 17, 2024

manulera commented Jul 17, 2024 •

edited

Loading

MarkusPiotrowski commented Jul 17, 2024

BjornFJohansson commented Jul 17, 2024 •

edited

Loading

BjornFJohansson commented Jul 17, 2024

MarkusPiotrowski commented Jul 18, 2024

MarkusPiotrowski commented Jul 18, 2024 •

edited

Loading

MarkusPiotrowski commented Jul 19, 2024

BjornFJohansson commented Jul 19, 2024

MarkusPiotrowski commented Jul 19, 2024

manulera commented Jul 22, 2024

MarkusPiotrowski commented Jul 22, 2024

MarkusPiotrowski commented Jul 22, 2024

BjornFJohansson commented Jul 22, 2024

manulera commented Jul 22, 2024

BjornFJohansson commented Jul 23, 2024 •

edited

Loading

manulera commented Jul 31, 2024 •

edited

Loading

hiyama341 commented Sep 5, 2024

Reproducing the NEB Tm calculator behavior in pydna #237

Reproducing the NEB Tm calculator behavior in pydna #237

Comments

BjornFJohansson commented Jun 25, 2024 • edited Loading

manulera commented Jun 25, 2024

BjornFJohansson commented Jun 27, 2024 • edited Loading

BjornFJohansson commented Jun 28, 2024 • edited Loading

MarkusPiotrowski commented Jun 30, 2024

BjornFJohansson commented Jul 1, 2024

manulera commented Jul 16, 2024

manulera commented Jul 16, 2024 • edited Loading

hiyama341 commented Jul 16, 2024

BjornFJohansson commented Jul 17, 2024

manulera commented Jul 17, 2024

MarkusPiotrowski commented Jul 17, 2024

manulera commented Jul 17, 2024 • edited Loading

MarkusPiotrowski commented Jul 17, 2024

BjornFJohansson commented Jul 17, 2024 • edited Loading

BjornFJohansson commented Jul 17, 2024

MarkusPiotrowski commented Jul 18, 2024

MarkusPiotrowski commented Jul 18, 2024 • edited Loading

MarkusPiotrowski commented Jul 19, 2024

BjornFJohansson commented Jul 19, 2024

MarkusPiotrowski commented Jul 19, 2024

manulera commented Jul 22, 2024

MarkusPiotrowski commented Jul 22, 2024

MarkusPiotrowski commented Jul 22, 2024

BjornFJohansson commented Jul 22, 2024

manulera commented Jul 22, 2024

BjornFJohansson commented Jul 23, 2024 • edited Loading

manulera commented Jul 31, 2024 • edited Loading

hiyama341 commented Sep 5, 2024

BjornFJohansson commented Jun 25, 2024 •

edited

Loading

BjornFJohansson commented Jun 27, 2024 •

edited

Loading

BjornFJohansson commented Jun 28, 2024 •

edited

Loading

manulera commented Jul 16, 2024 •

edited

Loading

manulera commented Jul 17, 2024 •

edited

Loading

BjornFJohansson commented Jul 17, 2024 •

edited

Loading

MarkusPiotrowski commented Jul 18, 2024 •

edited

Loading

BjornFJohansson commented Jul 23, 2024 •

edited

Loading

manulera commented Jul 31, 2024 •

edited

Loading