Tableau is a Business Intelligence tool for visually analysing the data. Users can create and distribute interactive and shareable dashboards which depict the trends, variations and density of the data in form of graphs and charts. Tableau can connect to files, relational and Big data sources to acquire and process data. The software allows data blending and real time collaboration, which makes it very unique. It is used by businesses, academic researchers and many governments to do visual data analysis. It is also positioned as a leader Business Intelligence and Analytics Platform in Gartner Magic Quadrant.
Monday, 27 February 2017
Sunday, 12 February 2017
Visualising with Seaborn
Seaborn is a visualization library for making attractive and informative statistical graphics in Python. It is built on top of matplotlib and tightly integrated with the PyData stack, including support for numpy and panda data structures and statistical routines from SciPy and statsmodels.
Violinplot:
Jointplot:
Implot:
Pairplot:
Heatmap:
Clustermap:
Tuesday, 7 February 2017
Connecting to Your Data with Tableau
In this post, we will explore some data from the General Social Survey. The General Social Survey is an NSF-funded survey, interviewing more than 50,000 Americans over nearly 3 decades. Survey questions cover topics ranging from political, to economic, to educational and more. We will be taking premarital sex into consideration.
Thursday, 2 February 2017
Playing Around with Google Charts
In this post, I'll be trying my hand at creating a basic Google Chart.
I'm plotting a pie chart depicting the toppings that I like to have on a pizza.
googleVis - Google Charts & R
Data: StateMilk • Chart ID: GeoChartID216b4a75f4f6 • googleVis-0.6.2
R version 3.3.1 (2016-06-21) • Google Terms of Use • Documentation and Data Policy
R version 3.3.1 (2016-06-21) • Google Terms of Use • Documentation and Data Policy
Monday, 23 January 2017
Thursday, 12 January 2017
Creating Charts and Maps with ggplot2
Data Visualization with R
In this article, I will work on the following visualizations:
Rudimentary Visualizations:
- Scatter plot
- Histogram
- Line Plot
- Bar chart
- Box plot
Advanced Visualizations:
- Area Chart
- Heat Map
- Correlogram
- Hexbin Plot
- Mosaic Map
- Map Visualization
- 3D Graphs
I will work on the 'Big Mart data' data sets as well as some of the small data sets provided in the HistData package.
#load the Big Mart data set
data=read.csv("/Users/Srinka_MAC/Documents/Praxis/Term3/DVL/BigMartDataset.csv")
train = data
Rudimentary Visualizations:
Scatter Plot
library(ggplot2)
ggplot(train, aes(Item_Visibility, Item_MRP)) + geom_point() + scale_x_continuous("Item Visibility", breaks = seq(0,0.35,0.05))+ scale_y_continuous("Item MRP", breaks = seq(0,270,by = 30))+ theme_bw()
ggplot(train, aes(Item_Visibility, Item_MRP)) + geom_point(aes(color = Item_Type)) +
scale_x_continuous("Item Visibility", breaks = seq(0,0.35,0.05))+
scale_y_continuous("Item MRP", breaks = seq(0,270,by = 30))+
theme_bw() + labs(title="Scatterplot")
ggplot(train, aes(Item_Visibility, Item_MRP)) + geom_point(aes(color = Item_Type)) +
scale_x_continuous("Item Visibility", breaks = seq(0,0.35,0.05))+
scale_y_continuous("Item MRP", breaks = seq(0,270,by = 30))+
theme_bw() + labs(title="Scatterplot") + facet_wrap( ~ Item_Type)
Histogram
library(HistData)
library(RColorBrewer)
data(VADeaths)
par(mfrow=c(2,3))
hist(VADeaths,breaks=10, col=brewer.pal(3,"Set3"),main="Set3 3 colors")
hist(VADeaths,breaks=3 ,col=brewer.pal(3,"Set2"),main="Set2 3 colors")
hist(VADeaths,breaks=7, col=brewer.pal(3,"Set1"),main="Set1 3 colors")
hist(VADeaths,,breaks= 2, col=brewer.pal(8,"Set3"),main="Set3 8 colors")
hist(VADeaths,col=brewer.pal(8,"Greys"),main="Greys 8 colors")
hist(VADeaths,col=brewer.pal(8,"Greens"),main="Greens 8 colors")
Line Plot
plot(AirPassengers,type="l")
Bar Chart
ggplot(train, aes(Outlet_Establishment_Year)) + geom_bar(fill = "red")+theme_bw()+
scale_x_continuous("Establishment Year", breaks = seq(1985,2010)) +
scale_y_continuous("Count", breaks = seq(0,1500,150)) +
coord_flip()+ labs(title = "Bar Chart") + theme_gray()
Vertical Bar Chart
ggplot(train, aes(Item_Type, Item_Weight)) + geom_bar(stat = "identity", fill = "darkblue") +
scale_x_discrete("Outlet Type")+
scale_y_continuous("Item Weight", breaks = seq(0,15000, by = 500))+
theme(axis.text.x = element_text(angle = 90, vjust = 0.5)) + labs(title = "Bar Chart")
Stacked Bar chart
ggplot(train, aes(Outlet_Location_Type, fill = Outlet_Type)) + geom_bar()+
labs(title = "Stacked Bar Chart", x = "Outlet Location Type", y = "Count of Outlets")
Box plot
ggplot(train, aes(Outlet_Identifier, Item_Outlet_Sales)) + geom_boxplot(fill = "red")+
scale_y_continuous("Item Outlet Sales", breaks= seq(0,15000, by=500))+
labs(title = "Box Plot", x = "Outlet Identifier")
Advanced Visualizations:
Area Chart
ggplot(train, aes(Item_Outlet_Sales)) + geom_area(stat = "bin", bins = 30, fill = "steelblue") +
scale_x_continuous(breaks = seq(0,11000,1000))+ labs(title = "Area Chart",
x = "Item Outlet Sales", y = "Count")
Heat Map
ggplot(train, aes(Outlet_Identifier, Item_Type))+geom_raster(aes(fill = Item_MRP))+
labs(title ="Heat Map", x = "Outlet Identifier", y = "Item Type")+
scale_fill_continuous(name = "Item MRP")
Correlogram
install.packages("corrgram")
library(corrgram)
corrgram(train, order=NULL, panel=panel.shade, text.panel=panel.txt,
main="Correlogram")
Hexbin plot
library(hexbin)
a=hexbin(diamonds$price,diamonds$carat,xbins=40)
library(RColorBrewer)
plot(a)
library(RColorBrewer)
rf <- colorRampPalette(rev(brewer.pal(40,'Set3')))
hexbinplot(diamonds$price~diamonds$carat, data=diamonds, colramp=rf)
rf <- colorRampPalette(rev(brewer.pal(40,'Set3')))
hexbinplot(diamonds$price~diamonds$carat, data=diamonds, colramp=rf)
Mosaic Plot
data(HairEyeColor)
mosaicplot(HairEyeColor)
Map Visualization
devtools::install_github("rstudio/leaflet")
library(magrittr)
library(leaflet)
m <- leaflet() %>%
addTiles() %>% # Add default OpenStreetMap map tiles
addMarkers(lng=88.3630400, lat=22.5626300)
m # Print the map
3D Graph
library(Rcmdr)
data(iris, package="datasets")
scatter3d(Petal.Width~Petal.Length+Sepal.Length|Species, data=iris, fit="linear", residuals=TRUE, parallel=FALSE, bg="white", axis.scales=TRUE, grid=TRUE, ellipsoid=FALSE)
library(lattice)
attach(iris)# 3d scatterplot by factor level
cloud(Sepal.Length~Sepal.Width*Petal.Length|Species, main="3D Scatterplot by Species")
xyplot(Sepal.Width ~ Sepal.Length, iris, groups = iris$Species, pch= 20)
Wednesday, 11 January 2017
Working with GGPlot
What is the Grammar of Graphics?
The basic idea behind it is to independently specify plot building blocks and combine them to create just about any kind of graphical display you want.
GGPlot2
The ggplot2 is a package that offers a powerful graphics
language for creating elegant and complex plots. Originally based on Leland
Wilkinson's The Grammar of Graphics, ggplot2 allows you to create graphs that
represent both univariate and multivariate numerical and categorical data in a
straightforward manner.
#install & load ggplot library
install.package("ggplot2")
library("ggplot2")
### show info about the data
head(diamonds)
head(mtcars)
### comparison qplot vs ggplot
# qplot histogram
qplot(clarity, data=diamonds, fill=cut, geom="bar")
### how to use qplot
# scatterplot
qplot(wt, mpg, data=mtcars)
# transform input data with functions
qplot(log(wt), mpg - 10, data=mtcars)
# add aesthetic mapping (hint: how does mapping work)qplot(wt, mpg, data=mtcars, color=qsec)
# change size of points (hint: color/colour, hint: set aesthetic/mapping)
qplot(wt, mpg, data=mtcars, color=qsec, size=3)
#use alpha blending
qplot(wt, mpg, data=mtcars, alpha=qsec)
#continuous scale vs. discrete scale
head(mtcars)
qplot(wt, mpg, data=mtcars, colour=cyl)
levels(mtcars$cyl)
qplot(wt, mpg, data=mtcars, colour=factor(cyl))
# combine mappings (hint: hollow points, geom-concept, legend combination)
qplot(wt, mpg, data=mtcars, size=qsec, color=factor(carb))
qplot(wt, mpg, data=mtcars, size=qsec, color=factor(carb), shape=I(1))
qplot(wt, mpg, data=mtcars, size=qsec, shape=factor(cyl), geom="point")
qplot(wt, mpg, data=mtcars, size=factor(cyl), geom="point")
# bar-plot
qplot(factor(cyl), data=mtcars, geom="bar")
# flip plot by 90°
qplot(factor(cyl), data=mtcars, geom="bar") + coord_flip()
# difference between fill/color bars
qplot(factor(cyl), data=mtcars, geom="bar", fill=factor(cyl))
qplot(factor(cyl), data=mtcars, geom="bar", colour=factor(cyl))
# fill by variable
qplot(factor(cyl), data=mtcars, geom="bar", fill=factor(gear))
# use different display of bars (stacked, dodged, identity)
head(diamonds)
qplot(clarity, data=diamonds, geom="bar", fill=cut, position="stack")
# histogram
qplot(carat, data=diamonds, geom="histogram")
# change binwidth
qplot(carat, data=diamonds, geom="histogram", binwidth=0.01)
# use geom to combine plots (hint: order of layers)
qplot(wt, mpg, data=mtcars, geom=c("point", "smooth"))
qplot(wt, mpg, data=mtcars, color=factor(cyl), geom=c("point", "smooth"))
# using ggplot-syntax with qplot (hint: qplot creates layers automatically)
qplot(mpg, wt, data=mtcars, color=factor(cyl), geom="point") + geom_line()
# add an additional layer with different mapping
p.tmp + geom_point()
p.tmp + geom_point() + geom_point(aes(y=disp))
# dealing with overplotting (hollow points, pixel points, alpha[0-1] )
t.df <- data.frame(x=rnorm(2000), y=rnorm(2000))
p.norm <- ggplot(t.df, aes(x,y))
p.norm + geom_point()
p.norm + geom_point(shape=1)
p.norm + geom_point(shape=".")
p.norm + geom_point(colour=alpha("blue", 1/10))
# using scales (color palettes, manual colors, matching of colors to values)
p.tmp <- qplot(cut, data=diamonds, geom="bar", fill=cut)
p.tmp
RColorBrewer::display.brewer.all()
p.tmp + scale_fill_manual(values=c("#7fc6bc","#083642","#b1df01","#cdef9c","#466b5d"))
### create a pie-chart, radar-chart (hint: not recommended)
# map a barchart to a polar coordinate system
p.tmp <- ggplot(mtcars, aes(x=factor(1), fill=factor(cyl))) + geom_bar(width=1)
p.tmp
p.tmp + coord_polar(theta="y")
p.tmp + coord_polar()
ggplot(mtcars, aes(factor(cyl), fill=factor(cyl))) + geom_bar(width=1) + coord_polar()
The ggplot2 is a package that offers a powerful graphics
language for creating elegant and complex plots. Originally based on Leland
Wilkinson's The Grammar of Graphics, ggplot2 allows you to create graphs that
represent both univariate and multivariate numerical and categorical data in a
straightforward manner.
#install & load ggplot library
install.package("ggplot2")
library("ggplot2")
### show info about the data
head(diamonds)
head(mtcars)
### comparison qplot vs ggplot
# qplot histogram
qplot(clarity, data=diamonds, fill=cut, geom="bar")
### how to use qplot
# scatterplot
qplot(wt, mpg, data=mtcars)
# transform input data with functions
qplot(log(wt), mpg - 10, data=mtcars)
# add aesthetic mapping (hint: how does mapping work)qplot(wt, mpg, data=mtcars, color=qsec)
# change size of points (hint: color/colour, hint: set aesthetic/mapping)
qplot(wt, mpg, data=mtcars, color=qsec, size=3)
#use alpha blending
qplot(wt, mpg, data=mtcars, alpha=qsec)
#continuous scale vs. discrete scale
head(mtcars)
qplot(wt, mpg, data=mtcars, colour=cyl)
levels(mtcars$cyl)
qplot(wt, mpg, data=mtcars, colour=factor(cyl))
# combine mappings (hint: hollow points, geom-concept, legend combination)
qplot(wt, mpg, data=mtcars, size=qsec, color=factor(carb))
qplot(wt, mpg, data=mtcars, size=qsec, color=factor(carb), shape=I(1))
qplot(wt, mpg, data=mtcars, size=qsec, shape=factor(cyl), geom="point")
qplot(wt, mpg, data=mtcars, size=factor(cyl), geom="point")
# bar-plot
qplot(factor(cyl), data=mtcars, geom="bar")
# flip plot by 90°
qplot(factor(cyl), data=mtcars, geom="bar") + coord_flip()
# difference between fill/color bars
qplot(factor(cyl), data=mtcars, geom="bar", fill=factor(cyl))
qplot(factor(cyl), data=mtcars, geom="bar", colour=factor(cyl))
# fill by variable
qplot(factor(cyl), data=mtcars, geom="bar", fill=factor(gear))
# use different display of bars (stacked, dodged, identity)
head(diamonds)
qplot(clarity, data=diamonds, geom="bar", fill=cut, position="stack")
# histogram
qplot(carat, data=diamonds, geom="histogram")
# change binwidth
qplot(carat, data=diamonds, geom="histogram", binwidth=0.01)
# use geom to combine plots (hint: order of layers)
qplot(wt, mpg, data=mtcars, geom=c("point", "smooth"))
qplot(wt, mpg, data=mtcars, color=factor(cyl), geom=c("point", "smooth"))
# using ggplot-syntax with qplot (hint: qplot creates layers automatically)
qplot(mpg, wt, data=mtcars, color=factor(cyl), geom="point") + geom_line()
# add an additional layer with different mapping
p.tmp + geom_point()
p.tmp + geom_point() + geom_point(aes(y=disp))
# dealing with overplotting (hollow points, pixel points, alpha[0-1] )
t.df <- data.frame(x=rnorm(2000), y=rnorm(2000))
p.norm <- ggplot(t.df, aes(x,y))
p.norm + geom_point()
p.norm + geom_point(shape=1)
p.norm + geom_point(shape=".")
p.norm + geom_point(colour=alpha("blue", 1/10))
# using scales (color palettes, manual colors, matching of colors to values)
p.tmp <- qplot(cut, data=diamonds, geom="bar", fill=cut)
p.tmp
RColorBrewer::display.brewer.all()
p.tmp + scale_fill_manual(values=c("#7fc6bc","#083642","#b1df01","#cdef9c","#466b5d"))
### create a pie-chart, radar-chart (hint: not recommended)
# map a barchart to a polar coordinate system
p.tmp <- ggplot(mtcars, aes(x=factor(1), fill=factor(cyl))) + geom_bar(width=1)
p.tmp
p.tmp + coord_polar(theta="y")
p.tmp + coord_polar()
ggplot(mtcars, aes(factor(cyl), fill=factor(cyl))) + geom_bar(width=1) + coord_polar()
Subscribe to:
Posts (Atom)