Jirka's Public Notepad

Data Engineering | Python | SQL Server | Teradata

July 8, 2016 By Jiří Hubáček Leave a Comment

Connecting from R to Teradata database

I’ve been taking courses from Coursera’s Data Science specialization for the last three months already. All the tasks are being performed using R and its contemporary packages like dplyr, lubridate, ggplot2, etc. I’ve gained really lots of hands-on experience throughout the courses – and in a very engaging way.

In my project, I needed to connect from R to Teradata to visualize data quality metrics and their trends. What could be better than doing this directly in R, right? 🙂 Reasons:

  • I’m learning R :).
  • Can start right away.
  • R and R Studio are free, so no need to take weeks to acquire a license in the corporate world (depending on who you know of course) takes weeks.
  • High usability through scriptability. Profiling a different table is a matter of changing a parameter.
  • Repeatable – I can produce the same output multiple times.

So I when started, first I needed to figure out how to connect to Teradata – which is what this blog post is about.

There’s a package for connecting to Teradata database made by the company. The package is available on the Teradata’s github page. We can download it using a browser or (the preferred way) fetch it directly from R using:

download.file("https://github.com/Teradata/teradataR/raw/master/build/teradataR_1.1.0.tar.gz",
              destfile = "teradataR_1.1.0.tar.gz")

Then we need to install it along with RODBC package which is available through the CRAN network.

install.packages("c:/temp/teradataR_1.1.0.tar.gz", repos=NULL, type="source")
install.packages("RODBC")

Finally, we can load the packages using library command, connect to the Teradata engine and test the connection by querying the sys_caledar table.

library(teradataR)
library(RODBC)

tdConnect("DSN_NAME_HERE",uid = "USER_NAME_HERE", pwd = "PASSWORD_HERE")

df <- tdQuery("select * from sys_calendar.calendar sample 100;")
head(df)

tdClose()

Be nice and close the connection when you’re done 🙂

Related

Filed Under: Uncategorized

Leave a Reply Cancel reply

Your email address will not be published. Required fields are marked *

This site uses Akismet to reduce spam. Learn how your comment data is processed.

  • GitHub
  • LinkedIn
  • RSS
  • Twitter
© 2022 · Jiří Hubáček, PGP