Events

November, 2024
25 Nov 1:00 pm 4:00 pm

Intro to Linux Command Line

Working with many of the HPC systems (like those at SciNet) involves using the Linux/UNIX command line. This provides a very powerful interface, but it can be quite daunting for the uninitiated. This half-day course covers the basic commands to get you initiated, and it could be a great boon for your productivity. Format: Virtual
SCMP101 - Nov 2024
26 Nov 1:00 pm 2:00 pm

Intro to Programming with Python

New to programming? Learn the basics of programming using Python in eight one-hour sessions over the course of four weeks. Sessions will consist of a mix of lectures and hands-on exercises. Format: In-person. Sessions will be recorded. Location: SciNet Teaching Room
SCMP142 - Nov 2024
28 Nov 1:00 pm 2:00 pm

Intro to Programming with Python

SCMP142 - Nov 2024
28 Nov 11:59 pm

Assignment 10 - new is due

Due date: November 28th at midnight (Thursday night).



In this assignment we will perform a clustering analysis on some codon data. Codons, for those unfamiliar, are sequences of three nucleotides that form a unit of genetic information. Since there are 4 nucleotide bases (A, C, T, G), there are 4*4*4=64 possible codons. You can read more about codons here.
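
As a quick check of the count above, the 64 codons can be enumerated directly. This is only an illustrative sketch; the variable names are made up for the example, and the data file itself may label bases differently.

```r
# Enumerate every ordered triple of the four bases named in the text.
bases <- c("A", "C", "T", "G")
grid <- expand.grid(first = bases, second = bases, third = bases,
                    stringsAsFactors = FALSE)

# Paste the three positions together to form each codon string.
codons <- paste0(grid$first, grid$second, grid$third)

length(codons)   # 4 * 4 * 4 = 64
```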

The data for this assignment were originally taken from here, though the original source is here. I have modified the data to simplify it somewhat; the modified data can be found here. The data consist of codon relative-frequency data for 3000 species of organisms. We will perform a clustering analysis on these data to determine whether there are any significant clusters, and which 'Kingdoms' (types of organisms) are most common in those clusters.



0) You must use version control ("git") as you develop your code. We suggest you start, from the Linux command line, by creating a new directory (e.g. assignment10), changing into that directory, and initializing a git repository ("git init") within it. Then perform "git add ..., git commit" repeatedly as you add to your code. You will hand in the output of "git log" for your assignment repository as part of the assignment. You must have a significant number of commits representing the changes to your code. If your log does not show a significant number of commits with meaningful commit messages, you will lose marks.



1) Create a file named Codon.Utilities.R containing the following functions.

1a) Write a function which takes a single string argument, the name of the data file. The function should load the data, keep only the features (the codon columns) and the target (the 'Kingdom' column), and return the resulting data frame. You may hard-code the column names involved.
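
A minimal sketch of such a loader is shown below. The function name is hypothetical, and the rule used to pick out codon columns (names made of exactly three letters from A, C, G, T, U) is an assumption about the file layout; hard-coding the column names, as the assignment allows, would work just as well.

```r
# Hypothetical loader: keeps the 'Kingdom' column plus the codon columns.
load.codon.data <- function(filename) {
    raw <- read.csv(filename, stringsAsFactors = FALSE)

    # The 'Kingdom' column is named in the assignment text.
    if (!("Kingdom" %in% names(raw))) {
        stop("No 'Kingdom' column found in ", filename)
    }

    # Assumption: codon columns are named by their three bases, e.g. "UUU".
    codon.cols <- grep("^[ACGTU]{3}$", names(raw), value = TRUE)

    raw[, c("Kingdom", codon.cols), drop = FALSE]
}
```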

1b) Write a function which, given some input data and a percentage, will perform a principal component analysis. The function should keep only those principal components whose standard deviations are greater than or equal to the standard deviation of the first principal component, multiplied by the second argument (see the tol argument to the prcomp function). The function should then return the model.
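
One possible shape for this function is sketched here. The name run.pca is hypothetical, and the sketch assumes the "percentage" is passed as a fraction (0.2 for 20%) and that the default centring without scaling is acceptable; neither choice is specified above.

```r
# Hypothetical wrapper around prcomp: components with small standard
# deviations, relative to the first component, are dropped via 'tol'.
run.pca <- function(data, fraction) {
    # Assumption: 'fraction' is a number between 0 and 1, e.g. 0.2 for 20%.
    stopifnot(is.numeric(fraction), fraction >= 0, fraction <= 1)
    prcomp(data, tol = fraction)
}

# Toy usage: two high-variance columns and one nearly constant column.
set.seed(1)
toy <- data.frame(a = rnorm(200, sd = 10),
                  b = rnorm(200, sd = 5),
                  c = rnorm(200, sd = 0.1))
model <- run.pca(toy, 0.2)
colnames(model$rotation)   # names of the retained components
```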

1c) Write a function which, given some input data, will perform 10-fold cross-validation on \(k\)-means models, with \(k\) values ranging from 1 to 15. The function should generate a plot of cross-validation score versus \(k\). You may copy the cross-validation function given in lecture.
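
The lecture's cross-validation function is not reproduced here, so the sketch below substitutes an assumed score: the total squared distance from each held-out point to its nearest fitted centre, averaged over folds. The function names, the nstart value, and the make.plot switch are all hypothetical.

```r
# Assumed score: sum of squared distances from held-out rows to the
# nearest cluster centre. The lecture's score may be defined differently.
holdout.score <- function(centres, test) {
    d2 <- sapply(seq_len(nrow(centres)),
                 function(j) rowSums(sweep(test, 2, centres[j, ])^2))
    d2 <- matrix(d2, nrow = nrow(test))   # one row per point, one column per centre
    sum(apply(d2, 1, min))
}

# Hypothetical driver: n-fold cross-validation over a range of k values.
kmeans.cv <- function(data, k.values = 1:15, n.folds = 10, make.plot = TRUE) {
    data <- as.matrix(data)
    # Randomly assign each row to one of the folds.
    folds <- sample(rep(seq_len(n.folds), length.out = nrow(data)))

    scores <- sapply(k.values, function(k) {
        fold.scores <- sapply(seq_len(n.folds), function(f) {
            train <- data[folds != f, , drop = FALSE]
            test <- data[folds == f, , drop = FALSE]
            model <- kmeans(train, centers = k, nstart = 5)
            holdout.score(model$centers, test)
        })
        mean(fold.scores)
    })

    if (make.plot) {
        plot(k.values, scores, type = "b",
             xlab = "k", ylab = "Cross-validation score")
    }
    invisible(scores)
}
```

With this particular score, lower is better, and one would typically look for the value of k beyond which the curve stops dropping sharply.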

1d) Write a function which takes as arguments a trained \(k\)-means model and a vector of targets. The function should determine the Kingdom with the highest population in each cluster. The function should print out the cluster number, the name of the Kingdom and the percentage of the population of the cluster that it represents.
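
A sketch of one way to do this follows. The function name is hypothetical, and the only part of the model it relies on is the cluster component returned by kmeans. Returning the summary invisibly is an extra convenience, not a requirement of the assignment.

```r
# Hypothetical summary function: most common Kingdom in each cluster.
dominant.kingdoms <- function(model, targets) {
    clusters <- sort(unique(model$cluster))
    result <- data.frame(cluster = clusters,
                         kingdom = NA_character_,
                         percent = NA_real_,
                         stringsAsFactors = FALSE)

    for (i in seq_along(clusters)) {
        # Count how many members of this cluster belong to each Kingdom.
        counts <- table(targets[model$cluster == clusters[i]])
        result$kingdom[i] <- names(counts)[which.max(counts)]
        result$percent[i] <- 100 * max(counts) / sum(counts)

        cat("Cluster:", clusters[i], ", Kingdom:", result$kingdom[i],
            ", Percent:", result$percent[i], "%\n")
    }
    invisible(result)
}
```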



2) Create a script called Codon.Analysis.R that will perform the following steps:

Take a single command-line argument, the data to be processed.
Load the modified data file, linked above.
Run PCA on the features, keeping those principal components whose standard deviations are at least 20% of the first principal component's standard deviation.
Create a transformed version of the original features, using the principal components from the PCA. The predict function works to do this, in the usual way.
Run 10-fold cross-validation on \(k\)-means models, using the transformed data, to determine which value of \(k\) would be the ideal number of clusters for this data set.
At this point you should examine the figure which is generated by the function. Using your expertise, pick an ideal value of \(k\) for this data set. Put an explanation of your choice in the comments of your driver script.
Create a new \(k\)-means model, using your choice of ideal value of \(k\) and the transformed data.
Print out the Kingdom in each cluster of your new model which has the largest population, and its percentage of the cluster's total population.
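
The transform-then-cluster part of these steps might look like the following. R's built-in iris measurements are used purely as stand-in data, and the choice of 3 clusters and nstart = 10 is arbitrary; for the assignment, the value of k should come from your own reading of the cross-validation figure.

```r
# Stand-in data: four numeric columns from the built-in iris data set.
features <- iris[, 1:4]

# PCA, dropping components whose standard deviation is small relative
# to the first component's.
pca <- prcomp(features, tol = 0.2)

# Project the original features onto the retained principal components.
transformed <- predict(pca, features)

# Cluster in the reduced space (k = 3 is an arbitrary illustration).
set.seed(1)
model <- kmeans(transformed, centers = 3, nstart = 10)

dim(transformed)       # rows of data by number of retained components
table(model$cluster)   # cluster sizes
```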


Your script should output something like this, when run from the shell terminal:


$
$ Rscript Codon.Analysis.R codon_usage_filtered_small.csv
Cluster: 1 , Kingdom: Plant , Percent: 37.25136 %
Cluster: 2 , Kingdom: Virus , Percent: 45.86466 %
Cluster: 3 , Kingdom: Bacteria , Percent: 83.53659 %
Cluster: 4 , Kingdom: Plant , Percent: 23.86895 %
Cluster: 5 , Kingdom: Vertebrate , Percent: 68.92139 %
$

Note that warning messages may be generated by your code. These may be ignored. Be sure to comment and document your functions. Defensive programming is needed for the number of driver-script arguments and the presence of the data file.
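
The defensive checks mentioned above could be sketched as follows. The function name and the wording of the error messages are hypothetical; commandArgs and file.exists are standard base R functions.

```r
# Hypothetical argument check for the driver script.
check.arguments <- function(args) {
    # Exactly one argument is expected: the name of the data file.
    if (length(args) != 1) {
        stop("Usage: Rscript Codon.Analysis.R <data file>", call. = FALSE)
    }
    # The file must exist before we try to read it.
    if (!file.exists(args[1])) {
        stop("Data file not found: ", args[1], call. = FALSE)
    }
    args[1]
}

# In the driver script, one might then write:
# data.file <- check.arguments(commandArgs(trailingOnly = TRUE))
```

Using trailingOnly = TRUE returns only the arguments that follow the script name, so the length check counts just the user-supplied values.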




Submit your Codon.Utilities.R and Codon.Analysis.R files, your cross-validation figure, and the output of git log from your assignment repository.


Both R code files must be added and committed frequently to the repository. To capture the output of git log use redirection (git log > git.log, and hand in the git.log file). 



Assignments will be graded on a 10-point basis. The due date is November 28th 2024 (midnight), with a penalty of 0.5 points per day for late submissions until the cut-off date of December 5th 2024, at 10:00 am.

MSC1090 - Fall 2024
29 Nov 1:00 pm 4:00 pm

Intro to Apptainer

Container computing is gradually changing the way researchers are developing, sharing, and running software applications. Apptainer (formerly called Singularity) is gaining popularity in HPC for its performance, ease of use, portability, and security. In this course, we will explore: what is a container, why use a container, and how to use and create one. Format: Virtual
SCMP161 - Nov 2024
December, 2024
2 Dec 12:30 pm 2:00 pm

Intro to parallel programming, session 1/3

An introduction to concepts and techniques in parallel computing with compiled languages, e.g., C, C++ or Fortran. Both OpenMP and MPI will be introduced. Format: Virtual
HPC163 - Dec 2024
3 Dec 1:00 pm 2:00 pm

Intro to Programming with Python

SCMP142 - Nov 2024
4 Dec 12:30 pm 2:00 pm

Intro to parallel programming, session 2/3

HPC163 - Dec 2024
5 Dec 1:00 pm 2:00 pm

Intro to Programming with Python

SCMP142 - Nov 2024
6 Dec 12:30 pm 2:00 pm

Intro to parallel programming, session 3/3

HPC163 - Dec 2024
January, 2025
7 Jan 11:00 am 12:00 pm

PHY1610 Scientific Computing Lecture

This course is aimed at reducing your struggle in getting started with computational projects, and at making you a more efficient computational scientist. Topics include well-established best practices for developing software as it applies to scientific computations, common numerical techniques and packages, and aspects of high performance computing. While we will introduce the C++ language, students should already have some programming experience in one language or another. Despite the title, this course is suitable for many physical scientists (chemists, astronomers, ...). This is a graduate course that can be taken for graduate credit by UofT PhD and MSc students. Students who wish to do so should enrol using ACORN/ROSI. This is an in-person course.
PHY1610 - Winter 2025
9 Jan 11:00 am 12:00 pm

PHY1610 Scientific Computing Lecture

PHY1610 - Winter 2025
14 Jan 11:00 am 12:00 pm

PHY1610 Scientific Computing Lecture

PHY1610 - Winter 2025
16 Jan 11:00 am 12:00 pm

PHY1610 Scientific Computing Lecture

PHY1610 - Winter 2025
20 Jan 1:00 pm 4:00 pm

Linux Shell Scripting

Learn how to write bash scripts, use environment variables, control processes, and much more. Requires some basic Linux command line experience. Format: Virtual
SCMP201 - Jan 2025
21 Jan 11:00 am 12:00 pm

PHY1610 Scientific Computing Lecture

PHY1610 - Winter 2025
23 Jan 11:00 am 12:00 pm

PHY1610 Scientific Computing Lecture

PHY1610 - Winter 2025
28 Jan 11:00 am 12:00 pm

PHY1610 Scientific Computing Lecture

PHY1610 - Winter 2025
30 Jan 11:00 am 12:00 pm

PHY1610 Scientific Computing Lecture

PHY1610 - Winter 2025
February, 2025
3 Feb 1:00 pm 4:00 pm

Common bash command line idioms

This workshop explores various concise and useful constructs for working with the bash shell. The goal is to improve your shell skills. Attending this class requires some basic GNU/Linux command line experience. Format: Virtual. Time: 1:00 pm - 4:00 pm EST. SciNet Teaching Room
SCMP281 - Feb 2025
4 Feb 11:00 am 12:00 pm

PHY1610 Scientific Computing Lecture

PHY1610 - Winter 2025
6 Feb 11:00 am 12:00 pm

PHY1610 Scientific Computing Lecture

PHY1610 - Winter 2025
11 Feb 11:00 am 12:00 pm

PHY1610 Scientific Computing Lecture

PHY1610 - Winter 2025
13 Feb 11:00 am 12:00 pm

PHY1610 Scientific Computing Lecture

PHY1610 - Winter 2025
25 Feb 11:00 am 12:00 pm

PHY1610 Scientific Computing Lecture

PHY1610 - Winter 2025
27 Feb 11:00 am 12:00 pm

PHY1610 Scientific Computing Lecture

PHY1610 - Winter 2025
March, 2025
3 Mar 1:00 pm 4:00 pm

Intro to Linux Command Line

Working with many of the HPC systems (like those at SciNet) involves using the Linux/UNIX command line. This provides a very powerful interface, but it can be quite daunting for the uninitiated. This half-day course covers the basic commands to get you initiated, and it could be a great boon for your productivity. Format: Virtual
SCMP101 - Mar 2025
4 Mar 11:00 am 12:00 pm

PHY1610 Scientific Computing Lecture

PHY1610 - Winter 2025
6 Mar 11:00 am 12:00 pm

PHY1610 Scientific Computing Lecture

PHY1610 - Winter 2025
11 Mar 11:00 am 12:00 pm

PHY1610 Scientific Computing Lecture

PHY1610 - Winter 2025