Monday, 27 February 2017

Taking Visualization a Step Further with Tableau

Tableau is a business intelligence tool for visual data analysis. Users can create and distribute interactive, shareable dashboards that show the trends, variations and density of the data as graphs and charts. Tableau can connect to files, relational databases and big data sources to acquire and process data. The software supports data blending and real-time collaboration, which sets it apart. It is used by businesses, academic researchers and many governments for visual data analysis. It is also positioned as a Leader in the Gartner Magic Quadrant for Business Intelligence and Analytics Platforms.


Sunday, 12 February 2017

Visualising with Seaborn

Seaborn is a visualization library for making attractive and informative statistical graphics in Python. It is built on top of matplotlib and tightly integrated with the PyData stack, including support for NumPy and pandas data structures and statistical routines from SciPy and statsmodels.


Violinplot:




Jointplot:




lmplot:




Pairplot:




Heatmap:




Clustermap:



Tuesday, 7 February 2017

Connecting to Your Data with Tableau

In this post, we will explore some data from the General Social Survey. The General Social Survey is an NSF-funded survey that has interviewed more than 50,000 Americans over nearly three decades. Survey questions cover topics ranging from politics to economics to education and more. We will focus on the responses about premarital sex.





Thursday, 2 February 2017

Playing Around with Google Charts

In this post, I'll be trying my hand at creating a basic Google Chart.

I'm plotting a pie chart depicting the toppings that I like to have on a pizza.

googleVis - Google Charts & R


The googleVis package provides an interface between R and Google's chart tools. It lets users create web pages with interactive charts based on R data frames. The resulting charts are served locally by the R HTTP help server and have to be viewed in a browser.

I'm working with this spreadsheet that has some data on milk production in Indian states.
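As a rough sketch of how such a chart can be built, the snippet below passes a small data frame to gvisGeoChart(). The column names (State, Production) and the numbers are placeholders invented for illustration, not the contents of the spreadsheet, and the region options assume Google's GeoChart recognises the Indian state names as written.

```r
library(googleVis)

# Placeholder stand-in for the milk-production spreadsheet.
# Column names and values are assumptions, not the real data.
StateMilk <- data.frame(
  State      = c("Uttar Pradesh", "Rajasthan", "Gujarat", "Punjab"),
  Production = c(4, 3, 2, 1),
  stringsAsFactors = FALSE
)

# Build the geo chart: one region per state, shaded by Production
geo <- gvisGeoChart(
  StateMilk,
  locationvar = "State",
  colorvar    = "Production",
  options = list(
    region      = "IN",         # restrict the map to India
    displayMode = "regions",    # shade whole regions instead of drawing markers
    resolution  = "provinces",  # state-level borders
    width       = 600,
    height      = 400
  )
)

# plot() serves the chart through R's local HTTP server and opens a browser
if (interactive()) plot(geo)
```

Calling print(geo, "chart") instead of plot(geo) writes out only the HTML and JavaScript for the chart, which is convenient when the chart has to be pasted into a blog post.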

Data: StateMilk • Chart ID: GeoChartID216b4a75f4f6 • googleVis-0.6.2
R version 3.3.1 (2016-06-21)

Monday, 23 January 2017

Plotting a Bubble Chart Using Javascript

Let us try our hand at plotting a bubble chart.

Geomapping - Plotting Specific Addresses on a Map

Let us try to plot a specific address now.

Plotting a Circle Wave Using Javascript

Plotting Circle Wave

And it's done!

Geomapping - Plotting Cities on a Map

Plotting Cities on a Map

Voila!

Thursday, 12 January 2017

Creating Charts and Maps with ggplot2

Data Visualization with R

In this article, I will work on the following visualizations:

Rudimentary Visualizations:

  1. Scatter plot
  2. Histogram
  3. Line Plot
  4. Bar chart
  5. Box plot

Advanced Visualizations:

  1. Area Chart
  2. Heat Map
  3. Correlogram
  4. Hexbin Plot
  5. Mosaic Plot
  6. Map Visualization
  7. 3D Graphs

I will use the Big Mart data set as well as some of the small data sets provided in the HistData package.


#load the Big Mart data set
data=read.csv("/Users/Srinka_MAC/Documents/Praxis/Term3/DVL/BigMartDataset.csv")
train = data



Rudimentary Visualizations:

Scatter Plot

library(ggplot2)          

ggplot(train, aes(Item_Visibility, Item_MRP)) + geom_point() + scale_x_continuous("Item Visibility", breaks = seq(0,0.35,0.05))+ scale_y_continuous("Item MRP", breaks = seq(0,270,by = 30))+ theme_bw() 






ggplot(train, aes(Item_Visibility, Item_MRP)) + geom_point(aes(color = Item_Type)) + 
  scale_x_continuous("Item Visibility", breaks = seq(0,0.35,0.05))+
  scale_y_continuous("Item MRP", breaks = seq(0,270,by = 30))+
  theme_bw() + labs(title="Scatterplot")






ggplot(train, aes(Item_Visibility, Item_MRP)) + geom_point(aes(color = Item_Type)) + 
  scale_x_continuous("Item Visibility", breaks = seq(0,0.35,0.05))+
  scale_y_continuous("Item MRP", breaks = seq(0,270,by = 30))+ 
  theme_bw() + labs(title="Scatterplot") + facet_wrap( ~ Item_Type)



















Histogram

library(HistData)
library(RColorBrewer)

# VADeaths is a numeric matrix that ships with base R's datasets package
data(VADeaths)
par(mfrow=c(2,3))  # arrange the six histograms in a 2 x 3 grid
hist(VADeaths,breaks=10, col=brewer.pal(3,"Set3"),main="Set3 3 colors")
hist(VADeaths,breaks=3 ,col=brewer.pal(3,"Set2"),main="Set2 3 colors")
hist(VADeaths,breaks=7, col=brewer.pal(3,"Set1"),main="Set1 3 colors")
hist(VADeaths,breaks=2, col=brewer.pal(8,"Set3"),main="Set3 8 colors")
hist(VADeaths,col=brewer.pal(8,"Greys"),main="Greys 8 colors")
hist(VADeaths,col=brewer.pal(8,"Greens"),main="Greens 8 colors")

















Line Plot

plot(AirPassengers,type="l")



Bar Chart

ggplot(train, aes(Outlet_Establishment_Year)) + geom_bar(fill = "red")+
  scale_x_continuous("Establishment Year", breaks = seq(1985,2010)) + 
  scale_y_continuous("Count", breaks = seq(0,1500,150)) +
  coord_flip()+ labs(title = "Bar Chart") + theme_gray()





Vertical Bar Chart

ggplot(train, aes(Item_Type, Item_Weight)) + geom_bar(stat = "identity", fill = "darkblue") + 
  scale_x_discrete("Item Type")+ 
  scale_y_continuous("Item Weight", breaks = seq(0,15000, by = 500))+ 
  theme(axis.text.x = element_text(angle = 90, vjust = 0.5)) + labs(title = "Bar Chart") 





Stacked Bar chart

ggplot(train, aes(Outlet_Location_Type, fill = Outlet_Type)) + geom_bar()+
  labs(title = "Stacked Bar Chart", x = "Outlet Location Type", y = "Count of Outlets")




Box plot

ggplot(train, aes(Outlet_Identifier, Item_Outlet_Sales)) + geom_boxplot(fill = "red")+
  scale_y_continuous("Item Outlet Sales", breaks= seq(0,15000, by=500))+
  labs(title = "Box Plot", x = "Outlet Identifier")



Advanced Visualizations:


Area Chart

ggplot(train, aes(Item_Outlet_Sales)) + geom_area(stat = "bin", bins = 30, fill = "steelblue") + 
  scale_x_continuous(breaks = seq(0,11000,1000))+ labs(title = "Area Chart", 
                                                       x = "Item Outlet Sales", y = "Count") 



Heat Map

ggplot(train, aes(Outlet_Identifier, Item_Type))+geom_raster(aes(fill = Item_MRP))+
 labs(title ="Heat Map", x = "Outlet Identifier", y = "Item Type")+
scale_fill_continuous(name = "Item MRP") 



Correlogram

install.packages("corrgram")
library(corrgram)

corrgram(train, order=NULL, panel=panel.shade, text.panel=panel.txt,
         main="Correlogram") 



Hexbin plot

library(hexbin)

a=hexbin(diamonds$price,diamonds$carat,xbins=40)
library(RColorBrewer)
plot(a)




# Set3 provides at most 12 colours, so request 12 and let colorRampPalette interpolate
rf <- colorRampPalette(rev(brewer.pal(12,'Set3')))
hexbinplot(price~carat, data=diamonds, colramp=rf)



Mosaic Plot

data(HairEyeColor)
mosaicplot(HairEyeColor)




Map Visualization

devtools::install_github("rstudio/leaflet")

library(magrittr)
library(leaflet)
m <- leaflet() %>%
  addTiles() %>%  # Add default OpenStreetMap map tiles
  addMarkers(lng=88.3630400, lat=22.5626300)
m  # Print the map





3D Graph

library(Rcmdr)
data(iris, package="datasets")
scatter3d(Petal.Width~Petal.Length+Sepal.Length|Species, data=iris, fit="linear", residuals=TRUE, parallel=FALSE, bg="white", axis.scales=TRUE, grid=TRUE, ellipsoid=FALSE)





library(lattice)

# 3d scatterplot by factor level
cloud(Sepal.Length~Sepal.Width*Petal.Length|Species, data=iris, main="3D Scatterplot by Species")



xyplot(Sepal.Width ~ Sepal.Length, data=iris, groups = Species, pch= 20)




















Wednesday, 11 January 2017

Working with GGPlot

What is the Grammar of Graphics?

The basic idea behind it is to independently specify plot building blocks and combine them to create just about any kind of graphical display you want.
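To make the idea of independent building blocks concrete, here is a minimal sketch using ggplot2 and the built-in mtcars data. The object names (base, points, trend, panels, labels) are illustrative choices, not part of any API. Each block is stored separately and then combined with `+`.

```r
library(ggplot2)

# Each building block is an ordinary R object that can be stored on its own
base   <- ggplot(mtcars, aes(x = wt, y = mpg))    # data + aesthetic mapping
points <- geom_point(aes(colour = factor(cyl)))   # geometry layer
trend  <- geom_smooth(method = "lm", se = FALSE)  # statistical layer: linear fit
panels <- facet_wrap(~ gear)                      # one panel per number of gears
labels <- labs(title = "Weight vs. mileage", colour = "Cylinders")

# Combine the same blocks in different ways
p1 <- base + points                              # plain scatter plot
p2 <- base + points + trend                      # scatter plot with a fitted line
p3 <- base + points + trend + panels + labels    # faceted and labelled version

if (interactive()) print(p3)
```

Because the blocks are independent, swapping facet_wrap(~ gear) for another faceting variable, or geom_point() for another geom, changes one piece without touching the rest.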

GGPlot2

ggplot2 is a package that offers a powerful graphics language for creating elegant and complex plots. Based on Leland Wilkinson's The Grammar of Graphics, ggplot2 lets you create graphs of univariate and multivariate numerical and categorical data in a straightforward manner.

#install & load ggplot library
install.packages("ggplot2")
library("ggplot2")

### show info about the data
head(diamonds)
head(mtcars)



### comparison qplot vs ggplot
# qplot histogram
qplot(clarity, data=diamonds, fill=cut, geom="bar")
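For comparison, the same bar chart can be written with the full ggplot() syntax. This is a minimal sketch: p_bar and p_scatter are illustrative names. Note that qplot() is marked as deprecated from ggplot2 3.4.0 onwards, so the ggplot() form is the one more likely to keep working.

```r
library(ggplot2)

# Same stacked bar chart as qplot(clarity, data=diamonds, fill=cut, geom="bar")
p_bar <- ggplot(diamonds, aes(x = clarity, fill = cut)) +
  geom_bar()

# ggplot() counterpart of qplot(wt, mpg, data=mtcars)
p_scatter <- ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point()

if (interactive()) {
  print(p_bar)
  print(p_scatter)
}
```

The arguments that qplot() takes directly (fill=cut, color=qsec and so on) move inside aes(), and the geom argument becomes an explicit geom_*() layer.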



### how to use qplot
# scatterplot
qplot(wt, mpg, data=mtcars)


# transform input data with functions
qplot(log(wt), mpg - 10, data=mtcars)



# add aesthetic mapping (hint: how does mapping work)
qplot(wt, mpg, data=mtcars, color=qsec)


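# change size of points (hint: color/colour, hint: set aesthetic/mapping)
qplot(wt, mpg, data=mtcars, color=qsec, size=3)     # maps size to the constant 3 and adds a legend for it
qplot(wt, mpg, data=mtcars, colour=qsec, size=I(3)) # I() sets the size directly, no legend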



#use alpha blending
qplot(wt, mpg, data=mtcars, alpha=qsec)



#continuous scale vs. discrete scale
head(mtcars)
qplot(wt, mpg, data=mtcars, colour=cyl)



levels(mtcars$cyl)
qplot(wt, mpg, data=mtcars, colour=factor(cyl))
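The two plots differ because of how cyl is stored. The short base R check below shows that cyl is a plain numeric column with no levels, which ggplot2 maps to a continuous colour gradient, while factor(cyl) has three levels and gets a discrete palette. The name cyl_factor is only used for illustration.

```r
# cyl is stored as a plain number, so it has no factor levels
is.numeric(mtcars$cyl)   # TRUE
levels(mtcars$cyl)       # NULL

# Converting to a factor creates one level per distinct value
cyl_factor <- factor(mtcars$cyl)
levels(cyl_factor)       # "4" "6" "8"
nlevels(cyl_factor)      # 3
```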



# combine mappings (hint: hollow points, geom-concept, legend combination)
qplot(wt, mpg, data=mtcars, size=qsec, color=factor(carb))



qplot(wt, mpg, data=mtcars, size=qsec, color=factor(carb), shape=I(1))



qplot(wt, mpg, data=mtcars, size=qsec, shape=factor(cyl), geom="point")



qplot(wt, mpg, data=mtcars, size=factor(cyl), geom="point")



# bar-plot
qplot(factor(cyl), data=mtcars, geom="bar")



# flip plot by 90°
qplot(factor(cyl), data=mtcars, geom="bar") + coord_flip()




# difference between fill/color bars
qplot(factor(cyl), data=mtcars, geom="bar", fill=factor(cyl))



qplot(factor(cyl), data=mtcars, geom="bar", colour=factor(cyl))



# fill by variable
qplot(factor(cyl), data=mtcars, geom="bar", fill=factor(gear))



# use different display of bars (stacked, dodged, identity)
head(diamonds)
qplot(clarity, data=diamonds, geom="bar", fill=cut, position="stack")



# histogram
qplot(carat, data=diamonds, geom="histogram")



# change binwidth
qplot(carat, data=diamonds, geom="histogram", binwidth=0.01)



# use geom to combine plots (hint: order of layers)
qplot(wt, mpg, data=mtcars, geom=c("point", "smooth"))



qplot(wt, mpg, data=mtcars, color=factor(cyl), geom=c("point", "smooth"))



# using ggplot-syntax with qplot (hint: qplot creates layers automatically)
qplot(mpg, wt, data=mtcars, color=factor(cyl), geom="point") + geom_line()





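# add an additional layer with different mapping
# p.tmp is not defined earlier in this post; the definition below is an assumed
# base plot that matches the qplot call above
p.tmp <- ggplot(mtcars, aes(x=mpg, y=wt, colour=factor(cyl)))

p.tmp + geom_point()

p.tmp + geom_point() + geom_point(aes(y=disp))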


# dealing with overplotting (hollow points, pixel points, alpha[0-1] )

t.df <- data.frame(x=rnorm(2000), y=rnorm(2000))

p.norm <- ggplot(t.df, aes(x,y))

p.norm + geom_point()



p.norm + geom_point(shape=1)



p.norm + geom_point(shape=".")



p.norm + geom_point(colour=alpha("blue", 1/10))



# using scales (color palettes, manual colors, matching of colors to values)

p.tmp <- qplot(cut, data=diamonds, geom="bar", fill=cut)

p.tmp

RColorBrewer::display.brewer.all()




p.tmp + scale_fill_manual(values=c("#7fc6bc","#083642","#b1df01","#cdef9c","#466b5d"))





### create a pie-chart, radar-chart (hint: not recommended)

# map a barchart to a polar coordinate system

p.tmp <- ggplot(mtcars, aes(x=factor(1), fill=factor(cyl))) + geom_bar(width=1)

p.tmp

p.tmp + coord_polar(theta="y")



p.tmp + coord_polar()





ggplot(mtcars, aes(factor(cyl), fill=factor(cyl))) + geom_bar(width=1) + coord_polar()