My Notes


Nothing special, only my notes.

PROBLEM: if you try to execute with a SSIS Data Flow a stored procedure containing a temporary table, you receive this error message:

Exception from HRESULT: 0xC020204A
Error at From SQL Server to Excel [OLE DB Source [91]]: SSIS Error Code DTS_E_OLEDBERROR.  An OLE DB error has occurred. Error code: 0x80004005.
An OLE DB record is available.  Source: "Microsoft SQL Server Native Client 11.0"  Hresult: 0x80004005  Description: "The metadata could not be determined because statement 'INSERT INTO #TempTable [...]".
Error at From SQL Server to Excel [OLE DB Source [91]]: Unable to retrieve column information from the data source. Make sure your target table in the database is available.


SOLUTION: It's incredible that on Stackoverflow and various forums, the answer to this problem is "SSIS doesn't get along with temporary tables, you have to use tables variables or CTEs".
You can use temp table, you only have to specify to SSIS the resulting dataset of your stored procedure with the WITH RESULT SETS command:

    EXECUTE YourSP
    WITH RESULT SETS
    ((    
        One bigint NOT NULL, 
        Two varchar(35) NULL, 
        Three char(6) NULL
    ))

That's all.

GOAL: send an html email using Sql Server Integration Service

PROBLEM: the Send Mail Task in SSIS allows only email in text format and not in html format.

SOLUTION: use a C# script instead the Send Mail Task to create and send the email.
1) Add a Script Task to the solution and choose C# as language
2) Open the task and click on Edit Script
3) Under the "region namespaces" part, add this namespace:

   using System.Net.Mail;

4) After the "// TODO: Add your code here" add the following code (replace the highlighted part with your parameters):

   // Create email
   MailMessage msg = new MailMessage();

   // From
   msg.From = new MailAddress("example@gmail.com");
   string user = "massimiliano.figini";
   string password = "***************";

   // To
   msg.To.Add("example@yahoo.com");

   // CC
   msg.To.Add("example2@yahoo.it");

   // Subject
   msg.Subject = "Email subject test";

   // Text in html
   msg.Body = "<html><body><h1>Test</h1><p>Test <b>test</b> message.</p></body></html>";
   msg.IsBodyHtml = true;

   // Attachment
   msg.Attachments.Add(new Attachment("C:\\Test.txt"));

   // Send email
   SmtpClient smtp = new SmtpClient();
   smtp.Host = "smtp.gmail.com";
   smtp.Port = 587;
   smtp.UseDefaultCredentials = false;
   smtp.Credentials = new System.Net.NetworkCredential(user, password);
   smtp.EnableSsl = true;
   smtp.Send(msg);




GOAL: move data from SQL Server to Excel with SQL Server Integration Services.

PROBLEM: the package doesn't work and you receive the error "cannot convert between unicode and non-unicode"


import matplotlib.pyplot as plt
import numpy as np

# my dataframe is df

# plot for see if there is a linear relation
plt.scatter(col1, col2)
plt.show()

# create training and test data set
rand = np.random.rand(len(df)) < 0.8
train = cdf[rand]
test = cdf[~rand]

# MODEL
from sklearn import linear_model
regr = linear_model.LinearRegression()
train_x = np.asanyarray(train[['col1']])
train_y = np.asanyarray(train[['col2']])
regr.fit (train_x, train_y)

# coefficients
print ('Coefficients: ', regr.coef_)
print ('Intercept: ',regr.intercept_)

# plot output
plt.scatter(train.col1, train.col2)
plt.plot(train_x, regr.coef_[0][0]*train_x + regr.intercept_[0], color='red')


# EVALUATION
from sklearn.metrics import r2_score

test_x = np.asanyarray(test[['col1']])
test_y = np.asanyarray(test[['col2']])
test_y_ = regr.predict(test_x)

print("Mean absolute error: %.2f" % np.mean(np.absolute(test_y_ - test_y)))
print("Residual sum of squares (MSE): %.2f" % np.mean((test_y_ - test_y) ** 2))
print("R2-score: %.2f" % r2_score(test_y_ , test_y) )


import folium

# define the world map
world_map = folium.Map()

# display world map
world_map

# and you have a world map.
# You can customize this default definition of the world map by specifying
# the centre of your map and the intial zoom level.

# define the world map centered around Canada with a low zoom level
world_map = folium.Map(location=[41.53, 12.28], zoom_start=6)

# display world map
world_map



# if you want to display the physical map
world_map = folium.Map(location=[41.53, 12.28], zoom_start=6, tiles='Stamen Terrain')

# High contrast black and white map
world_map = folium.Map(location=[41.53, 12.28], zoom_start=6, tiles='Stamen Toner')

# add a marker with a pop-up label
binzago_map = folium.Map(location=[41.53, 12.28], zoom_start=6)
folium.features.CircleMarker(
        [45.62, 9.15],
        radius=10, # define how big you want the circle markers to be
        color='yellow',
        fill=True,
        popup=label,
        fill_color='red',
        fill_opacity=0.6
    ).add_child(folium.Popup('Binzago')).add_to(binzago_map)
binzago_map





import seaborn as sns

ax = sns.regplot(x='Col1', y='Col2', data=df_tot)

# we can use more features

import matplotlib as mpl
import matplotlib.pyplot as plt

plt.figure(figsize=(15, 10))     # bigger graph

sns.set(font_scale=2)     # bigger fonts
sns.set_style('whitegrid')     # white background

# make "+" green bigger marker (with marker, color e scatter_kws options)
ax = sns.regplot(x='Col1', y='Col2', data=df_tot, color='green', marker='+', scatter_kws={'s': 200})
ax.set(xlabel='COLUMN 1', ylabel='COLUMN 2')      # axis labels
ax.set_title('TITLE')       # graph title






import numpy as np

from PIL import Image     # converting images into arrays
import matplotlib as mpl
import matplotlib.pyplot as plt

# import package and its set of stopwords
from wordcloud import WordCloud, STOPWORDS

# remove stopwords
stopwords = set(STOPWORDS)

# we can add any word that we don't want visualize
stopwords.add('word')

# open the file and read it into a variable
words = open('words.txt', 'r').read()

# instantiate a word cloud object
new_wc = WordCloud(
    background_color='white',
    max_words=2000,
    stopwords=stopwords
)

# generate the word cloud
new_wc.generate(words)

# display the word cloud
plt.imshow(new_wc, interpolation='bilinear')
plt.axis('off')
plt.show()

# we can resize it
fig = plt.figure()
fig.set_figwidth(14)     # set width
fig.set_figheight(18)     # set height

# display the cloud
plt.imshow(new_wc, interpolation='bilinear')
plt.axis('off')
plt.show()


# We can use an image as background instead the rectangle when instantiate the object
new_img = np.array(Image.open('image.png'))

new_wc = WordCloud(
    background_color='white',
    max_words=2000,
    mask=new_img,
    stopwords=stopwords
)




import numpy as np 
import pandas as pd
from PIL import Image     # converting images into arrays
import matplotlib as mpl
import matplotlib.pyplot as plt
import matplotlib.patches as mpatches    # needed for waffle Charts
mpl.style.use('ggplot')     # optional: for ggplot-like style

df      # name of my table


# WAFFLE CHARTS

# Waffle charts are not built into any of the Python visualization libraries.
# So we have to create a function for it:

def create_waffle_chart(categories, values, height, width, colormap, value_sign=''):
    # compute the proportion of each category with respect to the total
    total_values = sum(values)
    category_proportions = [(float(value) / total_values) for value in values]
    # compute the total number of tiles
    total_num_tiles = width * height # total number of tiles
    print ('Total number of tiles is', total_num_tiles)
    
    # compute the number of tiles for each catagory
    tiles_per_category = [round(proportion * total_num_tiles) for proportion in category_proportions]
    
    # initialize the waffle chart as an empty matrix
    waffle_chart = np.zeros((height, width))
    # define indices to loop through waffle chart
    category_index = 0
    tile_index = 0
    # populate the waffle chart
    for col in range(width):
        for row in range(height):
            tile_index += 1
            # if the number of tiles populated for the current category 
            # is equal to its corresponding allocated tiles...
            if tile_index > sum(tiles_per_category[0:category_index]):
                # ...proceed to the next category
                category_index += 1       
            
            # set the class value to an integer, which increases with class
            waffle_chart[row, col] = category_index
    
    # instantiate a new figure object
    fig = plt.figure()
    # use matshow to display the waffle chart
    colormap = plt.cm.coolwarm
    plt.matshow(waffle_chart, cmap=colormap)
    plt.colorbar()
    # get the axis
    ax = plt.gca()
    # set minor ticks
    ax.set_xticks(np.arange(-.5, (width), 1), minor=True)
    ax.set_yticks(np.arange(-.5, (height), 1), minor=True)
    
    # add dridlines based on minor ticks
    ax.grid(which='minor', color='w', linestyle='-', linewidth=2)
    plt.xticks([])
    plt.yticks([])
    # compute cumulative sum of individual categories to match 
    # color schemes between chart and legend
    values_cumsum = np.cumsum(values)
    total_values = values_cumsum[len(values_cumsum) - 1]
    # create legend
    legend_handles = []
    for i, category in enumerate(categories):
        if value_sign == '%':
            label_str = category + ' (' + str(values[i]) + value_sign + ')'
        else:
            label_str = category + ' (' + value_sign + str(values[i]) + ')'
            
        color_val = colormap(float(values_cumsum[i])/total_values)
        legend_handles.append(mpatches.Patch(color=color_val, label=label_str))
    # add legend to chart
    plt.legend(
        handles=legend_handles,
        loc='lower center', 
        ncol=len(categories),
        bbox_to_anchor=(0., -0.2, 0.95, .1)
    )

# Now to create a waffle chart, all we have to do is call the function create_waffle_chart.
# Let's define the input parameters:

width = 40     # width of chart
height = 10     # height of chart
categories = df.index.values     # categories
values = df['Col1']      # correponding values of categories
colormap = plt.cm.coolwarm      # color map class

# Call the function
create_waffle_chart(categories, values, height, width, colormap)     # call the function