MathGroup Archive 2009

Re: ChemicalData[], SMILES, EdgeRules

  • To: mathgroup at smc.vnet.net
  • Subject: [mg104063] Re: ChemicalData[], SMILES, EdgeRules
  • From: Mike H <mike.honeychurch at gmail.com>
  • Date: Sat, 17 Oct 2009 07:04:26 -0400 (EDT)
  • References: <hb1ntf$son$1@smc.vnet.net> <200910140319.XAA22431@smc.vnet.net>

First, tallying elements:

Clear[elementCount]

(* Count literal occurrences of each symbol in the SMILES string.
   Note: "C" also matches the C in "Cl". *)
elementCount[smiles_String, elements_List] :=
  {#, StringCount[smiles, #]} & /@ elements

In[1]:=elementCount["CC1=CCC2CC1C2(C)C", {"C", "N"}]

Out[1]={{"C", 10}, {"N", 0}}
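One caveat with plain substring counting is that a one-letter symbol such as "C" is also counted inside a two-letter symbol such as "Cl". A minimal sketch of one way around that, assuming the element list passed in contains every symbol that occurs in the string; elementCountTokens is a hypothetical name, not something from Mathematica or from this thread. It does not treat lowercase aromatic atoms (c, n, o) or bracket atoms specially.

```mathematica
Clear[elementCountTokens]

(* Hypothetical helper: split the SMILES string into element tokens,
   trying longer symbols first so that "Cl" is not read as "C". *)
elementCountTokens[smiles_String, elements_List] :=
 Module[{ordered, tokens},
  ordered = Reverse[SortBy[elements, StringLength]];
  tokens = StringCases[smiles, Alternatives @@ ordered];
  (* report every requested element, including those with zero hits *)
  {#, Count[tokens, #]} & /@ elements
  ]

elementCountTokens["ClCCl", {"C", "Cl"}]
(* {{"C", 1}, {"Cl", 2}} *)
```

For comparison, StringCount["ClCCl", "C"] returns 3 for the same string.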

If you have a file with your SMILES data, you would need to convert it to a
list (Import["smiles.txt", "Data"] should do it), and then you can map
elementCount over that list:

smilesList = Import["smiles.txt", "Data"];

(* extend the element list as needed *)
elementCount[#, {"C", "O", "N"}] & /@ smilesList
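The file-to-list step could be sketched as follows, assuming smiles.txt holds one SMILES string per line. It uses the "Lines" import element rather than "Data" so that each line comes back as a plain string. readSmiles and countAll are hypothetical helper names introduced only for this illustration.

```mathematica
(* elementCount as defined in this post *)
elementCount[smiles_String, elements_List] :=
  {#, StringCount[smiles, #]} & /@ elements

(* Assumption: the file holds one SMILES string per line.
   Blank lines and surrounding whitespace are dropped. *)
readSmiles[file_String] :=
  Select[StringTrim /@ Import[file, "Lines"], # =!= "" &]

(* Pair each SMILES string with its element counts *)
countAll[list_List, elements_List] :=
  {#, elementCount[#, elements]} & /@ list

countAll[{"CC1=CCC2CC1C2(C)C", "CCO"}, {"C", "O", "N"}]
(* {{"CC1=CCC2CC1C2(C)C", {{"C", 10}, {"O", 0}, {"N", 0}}},
    {"CCO", {{"C", 2}, {"O", 1}, {"N", 0}}}} *)
```

With a real file the call would be countAll[readSmiles["smiles.txt"], {"C", "O", "N"}].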

As for the other things you mentioned in your post, I doubt that they can be
done because, as I said, that would require ChemicalData to recognise a set
of graph rules as a specific chemical, and it is not designed to do that.
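For the element count alone, the "VertexTypes" entry in the ImportString output quoted further down can be tallied directly, without going through ChemicalData. This is only a sketch, assuming the rule-list output shape shown in the quoted messages; vertexTally is a hypothetical name. Hydrogens are implicit in these SMILES strings, so they do not appear in "VertexTypes" and are not counted, unlike in "ElementTally".

```mathematica
Clear[vertexTally]

(* Hypothetical helper: tally the atom symbols listed under "VertexTypes"
   in a rule list shaped like the ImportString output in this thread. *)
vertexTally[rules_List] := Tally[Flatten["VertexTypes" /. rules]]

(* Rule list copied from the first message in the thread (edges omitted) *)
vertexTally[{"FormalCharges" -> {{0, 0, 0, 0, 0, 0, 0}},
  "VertexTypes" -> {{"C", "C", "O", "C", "C", "O", "O"}}}]
(* {{"C", 4}, {"O", 3}} *)
```

Under the same assumption, vertexTally[ImportString["CC1=CCC2CC1C2(C)C", "SMILES"]] would be the call for alpha-pinene.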

Mike

ps. do you also use Olivetol with Alphapinene?



On Fri, Oct 16, 2009 at 3:24 PM, Scot Martin <smartin at seas.harvard.edu> wrote:

> Mike,
>
> Thank you for your input on this. Here's the further information you asked
> about, most grateful if you can be helpful. First, what I am trying to do
> essentially? I have about 500 SMILES strings in a text file and I want to
> get the element count in each (i.e., number of carbon, oxygen, etc.). So, my
> plan was, "ImportString[] with SMILES and then use 'ElementTally' of
> ChemicalData[]." I was surprised, however, that I couldn't convert the
> SMILES data from ImportString[] into a format for ChemicalData[ ]. Here is
> some more specific information:
>
> In[1]:= ImportString["CC1=CCC2CC1C2(C)C", "SMILES"]
>
> Out[1]= {"EdgeRules" -> {{1 -> 2, 2 -> 3, 2 -> 7, 3 -> 4, 4 -> 5,
>    5 -> 6, 5 -> 8, 6 -> 7, 7 -> 8, 8 -> 9, 8 -> 10}},
>  "EdgeTypes" -> {{"Single", "Double", "Single", "Single", "Single",
>    "Single", "Single", "Single", "Single", "Single", "Single"}},
>  "FormalCharges" -> {{0, 0, 0, 0, 0, 0, 0, 0, 0, 0}},
>  "VertexTypes" -> {{"C", "C", "C", "C", "C", "C", "C", "C", "C",
>    "C"}}}
>
> In[2]:= GraphPlot[First["EdgeRules" /. %]]
>
> Out[2]:= :: graphics ::
>
>
> For comparison, this molecule is alpha-pinene:
>
> In[3]:= ChemicalData["AlphaPinene"]
>
> Out[3]:= :: graphics ::
>
>
> In[4]:= ChemicalData["AlphaPinene", "ElementTally"]
>
> Out[4]= {{"C", 10}, {"H", 16}}
>
>
> In[5]:= ChemicalData["AlphaPinene", "SMILES"]
>
> Out[5]= "CC1=CCC2CC1C2(C)C"
>
> But how to get ImportString to communicate to ChemicalData[] that I mean
> "AlphaPinene"?
>
> Scot
>
> -----Original Message-----
> From: Armand Tamzarian [mailto:mike.honeychurch at gmail.com]
> Sent: Thursday, October 15, 2009 07:19
> To: mathgroup at smc.vnet.net
> Subject: [mg104012] Re: ChemicalData[], SMILES, EdgeRules
>
> On Oct 14, 6:59 am, "Scot T. Martin" <smar... at seas.harvard.edu> wrote:
> > I have used ImportString[%,"SMILES"] to obtain the following from
> > Mathematica:
> >
> > {"EdgeRules" -> {{1 -> 2, 2 -> 3, 2 -> 4, 4 -> 5, 4 -> 7, 5 -> 6}},
> >   "EdgeTypes" -> {{"Single", "Double", "Single", "Single", "Single",
> >      "Single"}}, "FormalCharges" -> {{0, 0, 0, 0, 0, 0, 0}},
> >   "VertexTypes" -> {{"C", "C", "O", "C", "C", "O", "O"}}}
> >
> > My question is: how do I next use this information? I would like to have
> > access to all of the ChemicalData[] functionality, such as drawing a three
> > dimensional structure, molecular weight calculation, "ElementTally", and so
> > forth.
> >
> > However, I cannot find any way that Mathematica will recognize the Rules
> > list produced as output from ImportString as an input for chemical
> > structure for use by ChemicalData[].
> >
> > Anyone have hints?
>
> Can you provide more details? For starters, when you use GraphPlot with
> those rules you don't get a molecular structure; you get 4 unconnected
> lines, so I think the file you imported to generate those rules is
> unusual.
>
> ref/format/SMILES
>
> ChemicalData[] can be used to produce SMILES information but it
> doesn't follow -- at least as far as I know -- that given the SMILES
> information you can then identify the molecule in ChemicalData[] which
> is what you appear to be wanting to do (???)
>
> Mike
>
>
>

