<?xml version="1.0" encoding="iso-8859-1"?> 
<rss version="2.0">
<channel>
  <title>Genome Blog [Category - Bioinformatics]</title> 
  <description>Blog Description [Category - Bioinformatics]</description>
  <link><![CDATA[ http://genomealberta.ca/blogs/default.aspx ]]></link> 
  <language>en-us</language> 
  <pubDate>Thu, 10 May 12 22:21:46 UT</pubDate> 
  <lastBuildDate>Thu, 10 May 12 22:21:46 UT</lastBuildDate> 
  <docs>http://blogs.law.harvard.edu/tech/rss</docs> 
  <generator>Marqui 6.0</generator> 
  <managingEditor>System Administrator</managingEditor> 
  <webMaster>System Administrator</webMaster> 
  <item><title>Paul Gordon on Taverna</title><link>http://genomealberta.ca/blogs/paul-gordon-on-taverna.aspx</link><description><![CDATA[Guest post by Susanne Cardwell <BR>
    <BR>
    <BR>
    With over 40 publications to his name and over 100 citations as first author, Paul Gordon is a senior bioinformatician at the <A href="http://www.visualgenomics.ca/" target=_blank><STRONG>Sun Center of Excellence for Visual Genomics</STRONG></A>. He is about to defend his Ph.D. thesis on the topic of the Semantic Web. Susanne Cardwell holds an M.A. in Communications Studies. She is the coordinator for the Applied Computational Genomics Course (<A href="http://www.gcbioinformatics.ca/training">www.gcbioinformatics.ca/training</A>) and a colleague of Paul&rsquo;s at the Sun Center of Excellence for Visual Genomics. <BR>
    <BR>
    <STRONG>Susanne Cardwell:</STRONG> <STRONG>What is Taverna? <BR>
    </STRONG>Paul Gordon: Taverna is software that helps you automate bioinformatics analysis through the use of workflow diagrams. <BR>
    <BR>
    <STRONG>Susanne Cardwell:</STRONG> <STRONG>How does Taverna work? <BR>
    </STRONG>Paul Gordon: By editing these workflow diagrams, you can create custom analysis pipelines. These diagrams represent the flow of data (e.g., DNA sequences, protein names, or gene expression experiments) from one Web-based analysis service to another to complete a more comprehensive analysis than any one Web site could provide. <BR>
    <BR>
    <STRONG>Susanne Cardwell: What are the workflow diagrams you start with?</STRONG> <BR>
    Paul Gordon: At first you start with a blank canvas and basically you build up these diagrams by dragging and dropping components on the canvas. These components correspond to analysis services provided by different Web sites. <BR>
    <BR>
    <STRONG>Susanne Cardwell: How does Taverna interpret these Web sites? <BR>
    </STRONG>Paul Gordon: These remote Web sites provide an interface that other computers can easily access programmatically; these interfaces are called Web Services. Web Services are to automated analysis what Web forms are to the manual analysis people do. <BR>
    <BR>
    <STRONG>Susanne Cardwell: What are some of the special features of Taverna? <BR>
    </STRONG>Paul Gordon: Taverna is one of the few programming tools that doesn&rsquo;t require you to write any low level syntax. Most of the programming is point-and-click. <BR>
    <BR>
    <STRONG>Susanne Cardwell: What computer specifications are required to access Taverna? <BR>
    </STRONG>Paul Gordon: Taverna works on most computers as long as you have a recent version of Java installed. Because most of the computation is being done through calls to remote Web Services, you don&rsquo;t need a powerful computer to execute them. <BR>
    <BR>
    <STRONG>Susanne Cardwell: How does Taverna better facilitate effective biological/genetic research? <BR>
    </STRONG>Paul Gordon: Taverna lowers the barrier to analysis automation for people who don&rsquo;t have a background in traditional programming. The workflow diagrams themselves also communicate well in papers, when you want to document your methods and allow others to reproduce your results. <BR>
    <BR>
    <STRONG>Susanne Cardwell: What kinds of output does Taverna provide? <BR>
    </STRONG>Paul Gordon: This depends on which components you used in your workflow. Web Services tend to produce data in a format called XML, but it is also possible to include components in your workflow that reformat data into a form suitable for Excel. There are a lot of other built-in utilities, too numerous to name. You can also include small R scripts, which are useful for people who do a lot of statistical analysis. R is a programming language for statistics. <BR>
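    <BR>
    Editor&rsquo;s sketch (not part of the interview): outside Taverna, the same idea of reformatting XML output for Excel might look roughly like the Python below. It is only an illustration under stated assumptions. The element names (results, hit, gene, score) and the values are made up, and the sample XML is built in memory rather than fetched from a real Web Service. <BR>
    <PRE>
```python
import csv
import io
import xml.etree.ElementTree as ET

# Build a small, made-up XML result in memory. The element names
# (results, hit, gene, score) are hypothetical, not a real service's output.
results = ET.Element("results")
for gene, score in [("BRCA1", "98.5"), ("TP53", "87.0")]:
    hit = ET.SubElement(results, "hit")
    ET.SubElement(hit, "gene").text = gene
    ET.SubElement(hit, "score").text = score
xml_text = ET.tostring(results, encoding="unicode")


def xml_to_csv(xml_string):
    """Flatten each hit element into one CSV row that Excel can open."""
    root = ET.fromstring(xml_string)
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["gene", "score"])
    for hit in root.findall("hit"):
        writer.writerow([hit.findtext("gene"), hit.findtext("score")])
    return buffer.getvalue()


csv_text = xml_to_csv(xml_text)
print(csv_text)
```
    </PRE>
    Saving csv_text to a file with a .csv extension would give a table with one row per hit that a spreadsheet program can open. <BR>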
    <BR>
    <STRONG>Susanne Cardwell: How difficult is it for a beginner to learn Taverna? <BR>
    </STRONG>Paul Gordon: This depends on the amount of hubris that the beginner has. The easiest way to get started is to load an existing workflow and then run it if it already does what you want, or edit it if it just needs some tweaking. As you get more experience using the workflow editor, you&rsquo;ll be more comfortable generating workflows from scratch. You can find existing workflows at the myExperiment.org Web site, or you can generate workflows from an example manual analysis using Seahawk. Workflows are formal visual representations of sequential analysis pipelines (think arrows connecting boxes). <BR>
    <BR>
    <STRONG>Susanne Cardwell: What resources would you recommend for learning Taverna? <BR>
    </STRONG>Paul Gordon: I would recommend the Applied Computational Genomics Course (www.gcbioinformatics.ca/training) as it teaches an introduction to both Seahawk and Taverna. <BR>
    <BR>
    <BR>]]></description><pubDate>Sun, 20 Mar 11 17:00:00 UT</pubDate></item><item><title>Introduction to Magpie</title><link>http://genomealberta.ca/blogs/introduction-to-magpie.aspx</link><description><![CDATA[Guest post by Susanne Cardwell <BR>
    <BR>
    <BR>
    Mostafa Abdellateef is a Bioinformatics Programmer at the <A href="http://www.visualgenomics.ca/" target=_blank><STRONG>Center of Excellence for Visual Genomics</STRONG></A>, located at the University of Calgary. He holds a B.Sc. degree from Ryerson University and completed postgraduate work in bioinformatics at Seneca College. Susanne Cardwell, coordinator of the Applied Computational Genomics Course (Web accessible at <A href="http://www.gcbioinformatics.ca/training"><STRONG>www.gcbioinformatics.ca/training</STRONG></A>), holds an M.A. in Communications Studies. <BR>
    <BR>
    <STRONG>What is Magpie? <BR>
    </STRONG>MAGPIE is actually an acronym which stands for: Multi-purpose Automated Genome Project Investigation Environment. More specifically, MAGPIE is a software package that offers automated curation and presentation of DNA and protein sequences. <BR>
    <BR>
    <STRONG>Who uses Magpie? <BR>
    </STRONG>Researchers studying DNA and/or protein sequences use Magpie.]]></description><pubDate>Sun, 06 Mar 11 06:30:00 UT</pubDate></item><item><title>Paul Gordon on Computer Languages</title><link>http://genomealberta.ca/blogs/paul-gordon-on-computer-languages.aspx</link><description><![CDATA[<BR>
    Guest post by Susanne Cardwell <BR>
    <BR>
    <BR>
    Paul Gordon is a Ph.D. Candidate in Computer Science at the University of Calgary with over 40 publications to his name. He is presently employed with the&nbsp;<A href="http://www.visualgenomics.ca/" target=_blank><STRONG>Sun Center of Excellence for Visual Genomics</STRONG></A> as a senior bioinformatician. Susanne Cardwell holds an M.A. in Communications Studies and coordinates the Applied Computational Genomics Course, offered in two locations across Canada each year (<STRONG><A href="http://www.gcbioinformatics.ca/training">www.gcbioinformatics.ca/training</A></STRONG>). <BR>
    <BR>
    <STRONG>Susanne Cardwell</STRONG>: <EM>What are the three most powerful computer languages and for what reasons?</EM> <BR>
    <STRONG>Paul Gordon:</STRONG> It depends on how you define powerful. For the computer geeks, most programming languages are Turing-complete. Essentially, what that means is that you could write any functionality in any of the languages. It is just more convenient in some than others because of the syntax and code libraries available in different languages. With that being said, for professional programmers, the most popular languages are probably variants of C, Java, and a scripting language such as Perl, Python, or Ruby. <BR>
    <BR>
    <STRONG>Susanne Cardwell:</STRONG> <EM>What are the top three computer languages for geneticists/biologists and for what reasons? <BR>
    </EM><STRONG>Paul Gordon</STRONG>: Generally, geneticists/biologists are most comfortable with the syntax of scripting languages, such as the aforementioned Perl, Python, and Ruby. Not only is the syntax friendlier because they are scripting languages, but also you don&rsquo;t need to use a compiler to see if your programs work. You call an interpreter instead. It eliminates a step in the development process. Additionally, there are large bioinformatics-specific code libraries for these languages that give you a leg up in whatever analyses you need to automate. <BR>
    <BR>
    <STRONG>Susanne Cardwell:</STRONG> <EM>How long does it take to learn a language like, for instance, Perl, and what circumstances would need to be in place to learn at that rate?</EM> <BR>
    <STRONG>Paul Gordon:</STRONG> I&rsquo;m still learning stuff, and I&rsquo;ve been coding in Perl for fifteen years. But, to get to a basic level of proficiency, I&rsquo;d say six months of regular exposure to the language is needed. <BR>
    <BR>
    <STRONG>Susanne Cardwell:</STRONG> <EM>After learning the basics of one language, how long does it take to learn the basics of a second language? <BR>
    </EM><STRONG>Paul Gordon:</STRONG> I would say normally significantly less time than the first language because you can carry over your knowledge of programming concepts and only need to learn the syntactic particulars of the language. <BR>
    <BR>
    <STRONG>Susanne Cardwell</STRONG>:<EM> Once you learn the basics in Python, what can you do in genetics research that you might not have been able to do before, assuming you are a biologist/geneticist? <BR>
    </EM><STRONG>Paul Gordon:</STRONG> Programming is all about automation. The most useful construct for a non-programmer to learn at first is the &ldquo;for&rdquo; loop. This will allow you to iterate over large sets of data and do your analysis much faster and less monotonously than manually doing the same thing for long lists of data (for example, a list of 100 genes you got from a microarray experiment or a list of 30 potential interacting proteins for your gene of interest). <BR>
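    <BR>
    Editor&rsquo;s sketch (not part of the interview): a minimal illustration of the &ldquo;for&rdquo; loop described above, written in Python. The gene names and sequences are made up, and GC content is used only as a stand-in for whatever analysis you would otherwise repeat by hand. <BR>
    <PRE>
```python
# A made-up list standing in for gene sequences from an experiment.
sequences = {
    "geneA": "ATGCGC",
    "geneB": "ATATAT",
    "geneC": "GGGCCC",
}


def gc_content(sequence):
    """Return the fraction of bases that are G or C."""
    gc_count = sequence.count("G") + sequence.count("C")
    return gc_count / len(sequence)


# The "for" loop repeats the same analysis for every gene in the list.
gc_by_gene = {}
for name, sequence in sequences.items():
    gc_by_gene[name] = gc_content(sequence)
    print(name, round(gc_by_gene[name], 2))
```
    </PRE>
    The loop body stays the same whether the list holds three genes or three thousand, which is where the time savings come from. <BR>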
    <BR>
    <STRONG>Susanne Cardwell:</STRONG> <EM>To incorporate bioinformatics into your research, would you need to know more than just Python or Perl? What other languages or programs would you need to learn in order to do effective research? <BR>
    </EM><STRONG>Paul Gordon:</STRONG> Well, of course, there is a lot of bioinformatics that happens without automation. It is important to know the existing programs that you can use as building blocks for your analysis &ndash; for example, what databases are available that might be relevant to your research? Also, what kinds of computational tools can assist you in evaluating your data, whether they be statistical, predictive, or evaluative? Generally, you will write programs that glue these building blocks together rather than rewriting BLAST or something of the sort. <BR>
    <BR>
    <STRONG>Susanne Cardwell</STRONG><EM><STRONG>:</STRONG> MIT has an OpenCourseWare project that offers training in Python (video tutorials, readings, assignments, and so forth at no cost to the user). What other resources are available to learn computer languages and programs for bioinformaticians? <BR>
    </EM><STRONG>Paul Gordon</STRONG>: Most programmers are still fans of books, believe it or not. There are plenty of good books for all of the programming languages that I have mentioned &ndash; just be sure to read the Amazon reviews.&nbsp;<A href="http://oreilly.com/" target=_blank><STRONG>O&rsquo;Reilly books</STRONG></A>&nbsp;are particularly well regarded and will act as a reference for you for many years (i.e., you don&rsquo;t have to memorize any functions). That being said, there is no substitute for being able to interact with experienced programmers while you are taking the first steps in programming. The Applied Computational Genomics Course (<STRONG><A href="http://www.gcbioinformatics.ca/training">www.gcbioinformatics.ca/training</A> </STRONG>) provides an introduction to Perl that can really help non-programmers gain the fundamental knowledge and confidence required to set them on the path to self-sufficiency. <BR>]]></description><pubDate>Tue, 04 Jan 11 20:30:00 UT</pubDate></item><item><title>Paul Gordon on XML for Biologists</title><link>http://genomealberta.ca/blogs/paul-gordon-on-xml-for-biologists.aspx</link><description><![CDATA[Guest post by Susanne Cardwell <BR>
    <BR>
    <BR>
    Paul Gordon is a University of Calgary Ph.D. Candidate with over 40 publications to his name. He is presently employed at the <A href="http://www.visualgenomics.ca/" target=_blank><STRONG>Sun Center of Excellence for Visual Genomics</STRONG></A>, a bioinformatics lab spearheaded by Dr. Christoph Sensen. Susanne Cardwell holds an M.A. in Communications Studies. She coordinates the Applied Computational Genomics Course (<A href="http://www.gcbioinformatics.ca/training" target=_blank><STRONG>www.gcbioinformatics.ca/training</STRONG></A>), which provides biologists/geneticists with bioinformatics tools, including an introduction to Perl. <BR>
    <BR>
    <STRONG>Susanne Cardwell</STRONG>:&nbsp; <EM>What is XML?</EM> <BR>
    <STRONG>Paul Gordon</STRONG>: XML is a way to format data. It is widely used to exchange information on the web. The key is that you can create your own labels for data to reflect your knowledge domain. The advantage is that you don&rsquo;t have to write your own parser software to extract information from these documents. <BR>
    <BR>
    <STRONG>Susanne Cardwell</STRONG>:&nbsp;<EM>What is a parser?</EM> <BR>
    <STRONG>Paul Gordon</STRONG>: A parser is a piece of software that breaks down a data file into meaningful subsections that can be used. For example, web browsers have HTML parsers in them so that they can convert the contents of a Web address into what you see on the screen. <BR>
    <BR>
    <STRONG>Susanne Cardwell</STRONG>:&nbsp; <EM>What is the difference between HTML and XML?</EM> <BR>
    <STRONG>Paul Gordon</STRONG>: HTML is focused on describing natural language documents that people read, like a newspaper Web site or any kind of Web site for that matter, whereas XML is normally used to describe data records, for example, the 3-dimensional structure of a protein or the strokes in a vector graphic (a scalable picture). <BR>
    <BR>
    <STRONG>Susanne Cardwell</STRONG>: <EM>Why is XML relevant for biologists? <BR>
    </EM><STRONG>Paul Gordon</STRONG>: XML is relevant for biologists because it is a very popular format for exchanging bioinformatics data. If you want to process large amounts of information from different Web sites, there are many software tools you can use to easily pick out the information you want from the XML files. For example, if you download records from the NCBI&rsquo;s GenBank database in XML format, you could easily retrieve all of the authors&rsquo; names from related publications for the given DNA or protein sequences. <BR>
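    <BR>
    Editor&rsquo;s sketch (not part of the interview): picking author names out of an XML record could look something like the Python below, using the standard library&rsquo;s ElementTree parser. The element names (record, reference, author) and the author names are hypothetical and far simpler than real GenBank XML, so treat this as an illustration of the approach rather than a working GenBank client. <BR>
    <PRE>
```python
import xml.etree.ElementTree as ET

# Build a tiny stand-in record in memory. The element names (record,
# reference, author) are hypothetical and much simpler than real GenBank XML.
record = ET.Element("record")
for authors in (["Smith,J.", "Lee,K."], ["Lee,K.", "Garcia,M."]):
    reference = ET.SubElement(record, "reference")
    for author in authors:
        ET.SubElement(reference, "author").text = author
xml_text = ET.tostring(record, encoding="unicode")


def author_names(xml_string):
    """Return each distinct author name, in order of first appearance."""
    root = ET.fromstring(xml_string)
    names = []
    # iter() walks every author element, however deeply it is nested.
    for element in root.iter("author"):
        if element.text not in names:
            names.append(element.text)
    return names


authors_found = author_names(xml_text)
print(authors_found)
```
    </PRE>
    For a real download, the element name passed to iter() would need to match whatever the file actually uses for authors. <BR>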
    <BR>
    <STRONG>Susanne Cardwell</STRONG>: <EM>How is XML integrated into software applications for geneticists or biologists?</EM> <BR>
    <STRONG>Paul Gordon</STRONG>: Most software either internally uses or can export XML versions of the data you are working on. <BR>
    <BR>
    <STRONG>Susanne Cardwell</STRONG>: <EM>Is there a common language for XML or are there different languages?</EM> <BR>
    <STRONG>Paul Gordon</STRONG>: XML is a meta-format, so you can create dialects of XML. It is like there is one grammar but many different vocabularies used by different applications and fields of study. You can make up your own XML dialect, and it can be easily parsed through the common grammar. <BR>
    <BR>
    <STRONG>Susanne Cardwell</STRONG>: <EM>How does XML tie into the Semantic Web and what are the implications for the future for biologists? <BR>
    </EM><STRONG>Paul Gordon</STRONG>: XML is the first step toward a unified way to automatically access data on the web by having a common grammar, whereas the Semantic Web is the next step, where there is only one global vocabulary instead of many dialects. So, the implication is that when everybody speaks with the same vocabulary and grammar about their data, it will be easier to answer biological questions using data from across the Web without needing programming. <BR>
    <BR>
    <STRONG>Susanne Cardwell</STRONG>: <EM>For what applications would a biologist need to know XML? <BR>
    </EM><STRONG>Paul Gordon</STRONG>: If the biologist wants to extract specific information from large databases and applications, but they have no interface to do it, then they could extract the information from the XML files that underlie the application or database. <BR>
    <BR>
    <STRONG>Susanne Cardwell</STRONG>: <EM>Could you please give an example of an XML script that a biologist would be able to use? <BR>
    </EM><STRONG>Paul Gordon</STRONG>: Biologists creating workflows (visual programs) in Taverna generate lots of XML data because the underlying services they call use XML. Often you need to &ldquo;shim&rdquo; data from the XML dialect of one service to the dialect of another. If you familiarize yourself with XML, this is pretty straightforward. <BR>]]></description><pubDate>Tue, 07 Dec 10 02:30:00 UT</pubDate></item><item><title>Scripting for Bioinformatics with Mostafa Abdellateef</title><link>http://genomealberta.ca/blogs/scripting-for-bioinformatics-with-mostafa-adbellateef.aspx</link><description><![CDATA[Guest post by Susanne Cardwell<BR>
    <BR>
    Mostafa Abdellateef is a bioinformatics programmer at the <A href="http://www.visualgenomics.ca/" target=_blank><STRONG>Sun Center of Excellence for Visual Genomics</STRONG></A>, spearheaded by Dr. Christoph Sensen. Mostafa holds a Bachelor of Science in Biology and later graduated from Seneca College&rsquo;s bioinformatics postgraduate program in Toronto. Susanne Cardwell is the Training Coordinator for the Applied Computational Genomics Course (<A href="http://www.gcbioinformatics.ca/training">http://www.gcbioinformatics.ca/training</A>), which teaches bioinformatics skills, including beginning Perl, to biologists. She holds a Master of Arts in Communications Studies. <BR>
    <BR>
    <STRONG>Susanne Cardwell:</STRONG>&nbsp; <EM>What examples of programming problems in bioinformatics might lead you to use C, Perl, Java, or SQL?</EM> <BR>
    <STRONG>Mostafa Abdellateef</STRONG>:&nbsp; You would want to use these languages to parse data from a result file, or to visualize this data, for example. Since the data sets are usually very large in bioinformatics, one might also use these languages to organize the data for easier and/or faster retrieval. <BR>
    <BR>
    <STRONG>Susanne Cardwell</STRONG>:&nbsp;&nbsp;<EM>Are pre-built programs like Bluejay and Magpie able to do many of the operations you require?</EM> <BR>
    <STRONG>Mostafa Abdellateef</STRONG>: With slight tweaking &ndash; yes. We continually modify the programs to fit the project's needs. You can download Magpie to your local system and make changes, as it is open source. It is there for anyone to use and change as they see fit. <BR>
    <BR>
    <STRONG>Susanne Cardwell</STRONG>: <EM>Can you write up an example of a short script that would help address a simple problem in bioinformatics?</EM> <BR>
    <STRONG>Mostafa Abdellateef:</STRONG> Here is one: <BR>
    perl -ne 'BEGIN{$/="&gt;"} print if /seq_name/' seq_file <BR>
    The above extracts the sequence name (seq_name) and the sequence from the FASTA sequence file (seq_file). <BR>
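    <BR>
    Editor&rsquo;s sketch (not part of the interview): for readers who prefer Python, a roughly comparable approach is shown below. It is not a line-for-line translation of the Perl one-liner. It matches seq_name against header lines only and returns just the first matching record, and the FASTA text is made up so that the example runs on its own. <BR>
    <PRE>
```python
import io


def extract_fasta_record(handle, seq_name):
    """Return (header, sequence) for the first record whose header contains seq_name."""
    header = None
    sequence_lines = []
    for line in handle:
        line = line.rstrip("\n")
        if line.startswith(">"):
            if header is not None:
                # The wanted record has ended; stop reading.
                break
            if seq_name in line:
                header = line[1:]
        elif header is not None:
            sequence_lines.append(line)
    if header is None:
        return None
    return header, "".join(sequence_lines)


# Made-up FASTA content standing in for seq_file.
fasta_text = ">gene1 example\nATGC\nGGTA\n>gene2 example\nTTTT\n"
result = extract_fasta_record(io.StringIO(fasta_text), "gene2")
print(result)
```
    </PRE>
    With a real file, the io.StringIO object would be replaced by the handle returned from open() on the FASTA file. <BR>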
    <BR>
    <STRONG>Susanne Cardwell</STRONG>:&nbsp; <EM>What problems in bioinformatics can Magpie address? GBrowse? BLAST?</EM> <BR>
    <STRONG>Mostafa Abdellateef</STRONG>: Magpie is quite versatile. The problems it can address range from gene prediction to gene annotation. GBrowse is a genome visualization tool. For example, it can allow you to visualize your gene predictions. Or you can visualize several genomes if you want to examine their similarities or differences. BLAST, or Basic Local Alignment Search Tool, is used to find similarities in sequences. <BR>
    <BR>
    <STRONG>Susanne Cardwell</STRONG>:&nbsp; <EM>How would one begin to create web visualization and gene prediction pipeline tools through Perl, JavaScript, HTML5, CGI, and AJAX?</EM> <BR>
    <STRONG>Mostafa Abdellateef</STRONG>: To create your own gene prediction pipeline, you would need an algorithm to identify the biologically functional sequences or genes. However, there already exist several gene prediction pipelines, such as Augustus, Genscan, and Mgene. They predict genes from a genomic sequence. To create a web-based tool for visualization, you should begin by determining what it is you want to visualize. Then, using your creativity, you would build a web interface to visualize the data and add functions to manipulate this data as you see fit. <BR>
    <BR>
    <STRONG>Susanne Cardwell</STRONG>:&nbsp; <EM>What resources are available for learning scripting and code for bioinformaticians to develop web visualization models and gene prediction pipelines? </EM><BR>
    <STRONG>Mostafa Abdellateef</STRONG>: The Applied Computational Genomics Course ( <STRONG><A href="http://www.gcbioinformatics.ca/training">www.gcbioinformatics.ca/training</A> </STRONG>) offered through the Center of Excellence is a good introductory bioinformatics course for learning scripting and coding &ndash; and for learning how to use various software tools. <BR>
    A book I would recommend is called &ldquo;Understanding Bioinformatics&rdquo; by Marketa J. Zvelebil. You would need a basic understanding of biology to benefit from this book. <BR>]]></description><pubDate>Tue, 23 Nov 10 03:30:00 UT</pubDate></item><item><title>Data Visualization with Paul Gordon</title><link>http://genomealberta.ca/blogs/data-visulaization-with-paul-gordon.aspx</link><description><![CDATA[<P style="MARGIN: 0cm 0cm 10pt"><SPAN><FONT face=Calibri><BR>
    Guest post by Susanne Cardwell <BR>
    <BR>
    <BR>
    Paul Gordon is&nbsp;a Ph.D. candidate at the University of Calgary with more than 40 publications completed prior to his graduation.&nbsp; He is also an employee at the <A href="http://www.visualgenomics.ca/" target=_blank><STRONG>Sun Center of Excellence for Visual Genomics</STRONG></A>, spearheaded by Dr. Christoph Sensen.&nbsp;&nbsp;Paul focuses on software development and biological data analysis.&nbsp; </FONT></SPAN></P>
    <P style="MARGIN: 0cm 0cm 10pt"><SPAN><FONT face=Calibri>Susanne Cardwell is the Training Coordinator for the Applied Computational Genomics Course (</FONT><A href="http://www.gcbioinformatics.ca/training"><FONT color=#0000ff face=Calibri>www.gcbioinformatics.ca/training</FONT></A><FONT face=Calibri>) with an intense&nbsp; curiosity for bioinformatics programming.&nbsp; &nbsp;</FONT></SPAN></P>
    <P style="MARGIN: 0cm 0cm 10pt"><B><SPAN><FONT face=Calibri>Susanne Cardwell:</FONT></SPAN></B><SPAN><FONT face=Calibri> <EM>What is data visualization?<BR>
    </EM><B>Paul Gordon:</B> Data visualization is a generic term for ways of graphing or otherwise visually representing sets of numbers that, in a scientific context, represent experiments or natural phenomena.</FONT></SPAN></P>
    <P style="MARGIN: 0cm 0cm 10pt"><B><SPAN><FONT face=Calibri>Susanne Cardwell:</FONT></SPAN></B><SPAN><FONT face=Calibri> <EM>Why is data visualization important?<BR>
    </EM><B>Paul Gordon:</B> Because humans are very visually oriented, it becomes much easier to understand large data sets when you can visually group them, see trends and outliers in the data, etc., versus looking at tables of raw numbers.</FONT></SPAN></P>
    <P style="MARGIN: 0cm 0cm 10pt"><B><SPAN><FONT face=Calibri>Susanne Cardwell:</FONT></SPAN></B><SPAN><FONT face=Calibri> <EM>What is the role of data visualization in bioinformatics software and tools?<BR>
    </EM><B>Paul Gordon:</B> It is the most user-friendly way to present the large datasets that bioinformatics analysis involves. It is an especially natural way to represent genomic data since the underlying DNA is a linear molecular structure (where, for example, the DNA is the x-axis in the graph).</FONT></SPAN></P>
    <P style="MARGIN: 0cm 0cm 10pt"><B><SPAN><FONT face=Calibri>Susanne Cardwell:</FONT></SPAN></B><SPAN><FONT face=Calibri> <EM>What type of computer programming software/design tools are required for developing data visualization?<BR>
    </EM><B>Paul Gordon:</B> Almost any programming language will contain graphing libraries, but some will have more pre-built items to use than others.&nbsp; If you were doing web-based visualization of data, then your main options are the scripting languages, such as Perl, Ruby, and Python.&nbsp; With modern browsers becoming more sophisticated, dynamically changing a webpage using just HTML and JavaScript along with some AJAX libraries is sufficient.&nbsp; For desktop graphics applications, C++ and Java are probably the most common.&nbsp; </FONT></SPAN></P>
    <P style="MARGIN: 0cm 0cm 10pt"><B><SPAN><FONT face=Calibri>Susanne Cardwell:</FONT></SPAN></B><SPAN><FONT face=Calibri> <EM>What kind of knowledge/personality does one need to have in order to create data visualization tools?&nbsp; To use data visualization for bioinformatics?<BR>
    </EM><B>Paul Gordon: </B>You&rsquo;ll need to learn a programming language like Java, Perl, Python, or Ruby.&nbsp; To use data visualization in bioinformatics, there are projects that have made generic visualization tools that you can customize with your own data and layout. Prime examples would be GBrowse for web-based genomic data and Bluejay for desktop visualization.</FONT></SPAN></P>
    <P style="MARGIN: 0cm 0cm 10pt"><B><SPAN><FONT face=Calibri>Susanne Cardwell:</FONT></SPAN></B><SPAN><FONT face=Calibri> <EM>How relevant is it for a biologist to include data visualization in his or her research?<BR>
    </EM><B>Paul Gordon:</B> It is very important because a picture is worth a thousand words and you only get so many words in an article.&nbsp; There are two aspects to it: One, it helps to communicate your research to others effectively and two, it can help you make sense of your own data in ways that might not be obvious from the raw data.</FONT></SPAN></P>
    <P style="MARGIN: 0cm 0cm 10pt"><B><SPAN><FONT face=Calibri>Susanne Cardwell:</FONT></SPAN></B><SPAN><FONT face=Calibri>&nbsp; <EM>Where would you recommend they get the skills for learning how to use/create data visualization models?&nbsp; <BR>
    </EM><B>Paul Gordon:</B> They should probably learn a little Perl and take the Applied Computational Genomics Course (</FONT><A href="http://www.gcbioinformatics.ca/training"><FONT color=#0000ff face=Calibri>www.gcbioinformatics.ca/training</FONT></A><FONT face=Calibri>), which will expose them to various data visualization tools.</FONT></SPAN></P>]]></description><pubDate>Thu, 28 Oct 10 13:15:00 UT</pubDate></item><item><title>An Interview with Mostafa Abdellateef on his Career in Bioinformatics</title><link>http://genomealberta.ca/blogs/an-interview-with-mostafa-abdellateef-his-career-in-bioinformatics.aspx</link><description><![CDATA[<BR>
    Guest post by Susanne Cardwell<BR>
    <BR>
    Mostafa Abdellateef is a bioinformatics programmer at the <A href="http://www.visualgenomics.ca/" target=_blank>Sun Center of Excellence for Visual Genomics</A>, spearheaded by Dr. Christoph Sensen. Susanne Cardwell is the Training Coordinator for the Applied Computational Genomics Course (<A href="http://www.gcbioinformatics.ca/training">http://www.gcbioinformatics.ca/training</A>), which teaches bioinformatics skills, including beginning Perl, to biologists. <BR>
    <BR>
    <STRONG>Susanne Cardwell</STRONG>: <EM>How did you enter the field of bioinformatics? <BR>
    </EM><BR>
    <STRONG>Mostafa Abdellateef:</STRONG> I entered the field of bioinformatics in 2008. I stumbled upon a great program offered at Seneca College in Toronto. Seneca offered a post graduate program intended for people with a science background who want to gain some of the computational skills for bioinformatics. Seneca offers a one year program that involves a standardized microbiology course and a core bioinformatics course that includes a real-world project. I did my project at the Sick Kids Hospital, focusing on finding chromatin modification proteins. Other courses at Seneca College included C (programming language), a course specifically for Perl, a statistics course, and a great technical writing course. I took a Java course and a SQL course from Seneca as well. <BR>
    <BR>
    <STRONG>Susanne Cardwell:</STRONG> <EM>What is the bioinformatics project you are currently working on? <BR>
    </EM><STRONG>Mostafa Abdellateef</STRONG>: The name of the project is Genozymes. The project stems from Concordia University. We are collaborating to find proteins secreted by fungi that are able to break down cellulose. Basically, cellulose is found in plants and trees, and the idea is to break down excess materials and turn them into simple sugars. Simple sugars can then be converted into energy or even things like LCD screens. <BR>
    <STRONG><BR>
    Susanne Cardwell:</STRONG> &nbsp;<EM>What does the bioinformatics project involve? <BR>
    </EM><STRONG>Mostafa Abdellateef:</STRONG> For me, it involves problem-solving, communication with the teams all over the world who are working on this project, and my direct bioinformatics skills learned in 2008. <BR>
    <BR>
    <STRONG>Susanne Cardwell</STRONG>: <EM>How does bioinformatics play into your project? <BR>
    </EM><STRONG>Mostafa Abdellateef:</STRONG> There is a key high-throughput component in this project. We deal with large datasets. We are hoping to have 30 new fungal genomes sequenced and analyzed by the end of the project. <BR>
    <BR>
    <STRONG>Susanne Cardwell:</STRONG> <EM>What software do you use to make your bioinformatics assessments?</EM> <BR>
    <STRONG>Mostafa Abdellateef:</STRONG> As a bioinformatics programmer, I have a number of bioinformatics tools available. Some of these include:
    <OL>
        <LI>GBrowse </LI>
        <LI>Magpie </LI>
        <LI>The Web, to search public databases with tools such as BLAST </LI>
    </OL>
    <P>A lot of the tools I use are ones I write myself. That is why they needed a programmer. Examples are web visualization and gene prediction pipelines. They are programmed using tools such as Perl, Javascript, HTML5, CGI, and AJAX. <BR>
    <BR>
    <STRONG>Susanne Cardwell:</STRONG> &nbsp;<EM>What special knowledge do you require in order to perform as a bioinformatician?</EM> <BR>
    <STRONG>Mostafa Abdellateef:</STRONG> &nbsp;The special knowledge involves the ability to problem solve and think analytically. These skills develop with exposure to the bioinformatics process. Examples of bioinformatics problem-solving include figuring out why a software package is acting in a certain way or assessing a scientific problem, such as determining whether what you are observing occurs naturally or is a computational misrepresentation. Learning various software programs is another component of problem-solving. <BR>
    <BR>
    <STRONG>Susanne Cardwell:</STRONG>&nbsp; <EM>To whom would you re commend bioinformatics career? <BR>
    </EM><STRONG>Mostafa Abdellateef:</STRONG>&nbsp; I would recommend bioinformatics to someone who has a passion for science and who is tech savvy. Since I was a little kid, I have liked computers. I went into science and didn&rsquo;t marry the two (computers and biology) until three years after I graduated with my Molecular Biology degree. Bioinformatics is for people with a science background who consider themselves tech savvy. <BR>
    <BR>
    </P>]]></description><pubDate>Sat, 23 Oct 10 14:30:00 UT</pubDate></item><item><title>Interview with Paul Gordon on Perl</title><link>http://genomealberta.ca/blogs/interview-with-paul-gordon-on-perl.aspx</link><description><![CDATA[<P>Guest post&nbsp;by Susanne Cardwell&nbsp;<BR>
    <BR>
    <BR>
    <SPAN><FONT face=Calibri>Paul Gordon is a Bioinformatician at the University of Calgary&rsquo;s Sun Center of Excellence for Visual Genomics, spearheaded by Dr. Christoph Sensen.&nbsp; Paul is also a Ph.D. student finalizing his breakthrough work on the Semantic Web.&nbsp; He has been a leader in the development of the Semantic Web for biologists, speaking at conferences with some of the leading&nbsp;thinkers on the subject today.<BR>
    <BR>
    </FONT></SPAN><SPAN><FONT face=Calibri>Susanne Cardwell is the Administrative Coordinator with Genome Canada and subsequently with the Sun Center of Excellence for Visual Genomics.&nbsp; She coordinates the Applied Computational Genomics Course, a week-long course offered twice each year that teaches Perl as one of its components&nbsp;(see <A href="http://www.gcbioinformatics.ca/training"><STRONG>www.gcbioinformatics.ca/training</STRONG></A> &nbsp;for more information). </FONT></SPAN></P>
    <P><SPAN><FONT face=Calibri></FONT></SPAN></P>
    <P><SPAN><FONT face=Calibri><STRONG>Susanne Cardwell:</STRONG>&nbsp; <EM>What is Perl?</EM></FONT></SPAN></P>
    <P><FONT face=Calibri><B>Paul Gordon:</B><SPAN>&nbsp; Perl is a programming language that is highly suitable for processing text-based data and processing web forms.</SPAN></FONT></P>
    <P><SPAN><FONT face=Calibri></FONT></SPAN></P>
    <P><FONT face=Calibri><B>Susanne Cardwell:</B>&nbsp; <EM>What is the role of Perl in bioinformatics?</EM> </FONT></P>
    <P><FONT face=Calibri><B>Paul Gordon:</B><SPAN> Because biological data tends to be very text-based, Perl is a natural fit for processing it.&nbsp; Examples of text-based data to process would include DNA sequence files, literature information, and so forth.</SPAN></FONT></P>
    <P><SPAN><FONT face=Calibri></FONT></SPAN></P>
    <P><FONT face=Calibri><B>Susanne Cardwell</B>: <EM>What is the difference between Perl and BioPerl?</EM></FONT></P>
    <P><FONT face=Calibri><B>Paul Gordon:</B><SPAN> BioPerl is a package of extra modules that make it easy to manipulate data in all the different formats that different institutions use.&nbsp; Examples of these are the NCBI formats and multiple sequence alignments.&nbsp; BioPerl supports whatever kind of manipulations the biologist needs to do with the data, often combining or cross-referencing them.&nbsp;&nbsp;</SPAN></FONT><SPAN><FONT face=Calibri>&nbsp;</FONT></SPAN></P>
    <P><FONT face=Calibri><B>Susanne Cardwell:</B> <EM>What is the BioPerl project?</EM></FONT></P>
    <P><FONT face=Calibri><B>Paul Gordon:</B><SPAN> You could write your own parsers for the data formats, but instead you can reuse the parsers other people have written.&nbsp; The collection of parsers and other utilities to manipulate data (e.g., reverse-complementing a DNA sequence) makes up the BioPerl project.&nbsp; </SPAN></FONT></P>
    <P><SPAN><FONT face=Calibri></FONT></SPAN></P>
    <P><FONT face=Calibri><B>Susanne Cardwell:</B><SPAN>&nbsp; <EM>What are some of the advantages of Perl over other languages, such as Python?</EM></SPAN></FONT></P>
    <P><FONT face=Calibri><B>Paul Gordon</B><SPAN>: One advantage is that there are more example scripts and packages like BioPerl, so you don&rsquo;t have to write everything from scratch.&nbsp; Additionally, Perl is a little more forgiving in its syntax.</SPAN></FONT><SPAN><FONT face=Calibri>&nbsp;</FONT></SPAN></P>
    <P><FONT face=Calibri><B>Susanne Cardwell</B>: <EM>What kinds of tasks can Perl do in genetics research that make it invaluable?</EM></FONT></P>
    <P><FONT face=Calibri><B>Paul Gordon</B><SPAN>: One of the most powerful characteristics of Perl is its ability to use regular expressions. If you can master regular expressions, it is very easy to extract data from any file format.&nbsp; Regular expressions are a text search syntax that lets you specify format rather than literal values.&nbsp; So, instead of searching for a specific number, such as 13 or 14, you could say, &ldquo;Find me more than one digit.&rdquo;&nbsp; </SPAN></FONT></P>
    <P><SPAN><FONT face=Calibri></FONT></SPAN></P>
    <P><FONT face=Calibri><B>Susanne Cardwell</B>:&nbsp; <EM>What resources would you recommend for scientists to learn Perl?</EM></FONT></P>
    <P><FONT face=Calibri><B>Paul Gordon:</B><SPAN> There is a book in the O&rsquo;Reilly series by James Tisdall called &ldquo;Beginning Perl for Bioinformatics&rdquo;.&nbsp; As a starting point, the Applied Computational Genomics Course (<A href="http://www.gcbioinformatics.ca/training"><SPAN style="TEXT-DECORATION: underline"><FONT color=#0000ff>www.gcbioinformatics.ca/training</FONT></SPAN></A>) is a good introduction to Perl and regular expressions.&nbsp; Self-starters could look at the many materials on the Perl website (<A href="http://learn.perl.org">http://learn.perl.org</A>).</SPAN></FONT><SPAN><FONT face=Calibri>&nbsp;</FONT></SPAN></P>
    <P><FONT face=Calibri><B>Susanne Cardwell:</B>&nbsp; <EM>What resources would you recommend for accessing sample Perl scripts?</EM></FONT></P>
    <P><FONT face=Calibri><B>Paul Gordon: </B><SPAN>&nbsp;Just googling the keywords for the task should return sample scripts.&nbsp; Many people post their scripts to mailing lists or code repositories.</SPAN></FONT></P>
    <P><SPAN><FONT face=Calibri></FONT></SPAN></P>
    <P><FONT face=Calibri><B>Susanne Cardwell</B>:&nbsp; <EM>What role does Perl play in bioinformatics open source software packages?</EM> </FONT></P>
    <P><FONT face=Calibri><B>Paul Gordon:</B><SPAN>&nbsp; Perl is one of the top three languages in which bioinformatics open source software is written, and it is probably the easiest of the three languages to learn.&nbsp; Java and C are significantly more difficult for a beginner, and additionally many C software projects are accessible via Perl modules anyways.</SPAN></FONT><SPAN><FONT face=Calibri>&nbsp;</FONT></SPAN></P>
    <P><FONT face=Calibri><B>Susanne Cardwell: </B><SPAN>&nbsp;<EM>Is Perl the first language a biologist should learn?</EM></SPAN></FONT></P>
    <P><FONT face=Calibri><B>Paul Gordon</B><SPAN>: Yes, simply because you can get something useful working faster with Perl than with the other languages.&nbsp; If your goal is simply presenting information on a website rather than number crunching, PHP and Python are good too.</SPAN></FONT></P>]]></description><pubDate>Tue, 12 Oct 10 16:15:00 UT</pubDate></item><item><title>Interview with Paul Gordon on Semantic Web Technologies</title><link>http://genomealberta.ca/blogs/interview-with-paul-gordon-on-semantic-web-technologies.aspx</link><description><![CDATA[<BR>
    <EM>guest post from Susanne Cardwell <BR>
    Administrative Coordinator&nbsp;<BR>
    </EM><A href="http://www.gcbioinformatics.ca/training" target=_blank><STRONG><EM>Bioinformatics Platform Applied Computational Genomics Course</EM></STRONG></A><BR>
    <BR>
    &nbsp; Paul Gordon, the bioinformatics specialist for the Sun Center of Excellence for Visual Genomics,&nbsp;gave the following description of Semantic Web Technologies and how they relate to Daggoo and Seahawk, the programs he is developing:<BR>
    <BR>
    &nbsp; &#8220;In a nutshell,&#8221; says Paul Gordon, &#8220;Semantic Web technologies are about using URLs instead of words to refer to concepts.&#8221; He says that the advantage is that URLs (i.e., Web addresses like http://...) are unambiguous &#8211; it&#8217;s easier for computers to use URLs as computers have historically had problems with interpreting natural language. He states that the reason you want to use URLs in this capacity is so that the computer can surf the web for you instead of you manually trying to find answers on the web. &#8220;In short, it is about having a web of data instead of a web of documents,&#8221; says Gordon. One major problem is how to shoehorn the current Web into this Semantic model, and this is his primary focus.&nbsp;<BR>]]></description><pubDate>Tue, 16 Mar 10 16:15:00 UT</pubDate></item><item><title>Genomics at BioHackathon 2010</title><link>http://genomealberta.ca/blogs/genomics-at-biohackathon-2010.aspx</link><description><![CDATA[<EM>guest post from Susanne Cardwell<BR>
    Administrative Coordinator<BR>
    </EM><A href="http://www.gcbioinformatics.ca/training" target=_blank><STRONG><EM>Bioinformatics Platform Applied Computational Genomics Course</EM></STRONG></A><BR>
    <BR>
    <BR>
    Paul Gordon, bioinformatics specialist with the Sun Center of Excellence for Visual Genomics, recently attended BioHackathon 2010 at the University of Tokyo. The objective of BioHackathon 2010 was to define technologies and standards for the global life sciences community for the next generation of web technologies, often called the Semantic Web. <BR>
    <BR>
    There were approximately 40-50 participants, and&nbsp;Paul was representing the Genome Canada Bioinformatics Platform. Several Canadians in attendance were among the pioneers in Semantic Web Technologies for the Life Sciences.&nbsp;<BR>
    <BR>
    BioHackathon 2010 was about creating a critical mass of data providers that supply information in the same format for the purpose of standardization. Standardization allows people to pose queries that require information from multiple databases. A major aim of the meeting was to educate the developers on some of the Semantic Web technologies, which include RDF, SPARQL, and Semantic Web Services.&nbsp;Paul Gordon and others briefed the participants on these technologies, how to create queries using them, and use case development. <BR>
    <BR>
    Another major aim of the meeting was the writing of computer code. Gordon focused on making it easier to use these technologies over existing databases. <BR>
    <BR>
    <BR>
    <BR>
    <img src="http://genomealberta.ca/files/Images/blogs/BioHackathon_Group_Pic_Resized.jpg" style="VERTICAL-ALIGN: bottom" alt="BioHackathon Group Photo" />]]></description><pubDate>Mon, 01 Mar 10 03:45:00 UT</pubDate></item>
</channel>
</rss>
