Jamie Thomson

Thoughts, about stuff

Posts Tagged ‘Python’

Post messages to a Microsoft Teams incoming webhook from behind a proxy using Python


The subject of this post is a bit of a mouthful, but it’s going to do exactly what it says on the tin.

I’ve been playing with Microsoft Teams a lot over the past few days and I wanted to programmatically post messages to a Teams channel using the language I’m using most often these days, Python. In my case this was complicated by the presence of my company’s proxy server, but I found a way around it and thought I’d post the code here for anyone that might be interested.

First you’ll need to create an incoming webhook. I’ll assume you know what that means, otherwise why would you be reading this? I’ll also assume you’ve already got your webhook URL as per the instructions at https://msdn.microsoft.com/en-us/microsoft-teams/connectors#create-the-webhook.

After that you just need to edit the code below with the address of your proxy server (I can’t help you with that), a username to authenticate to your proxy server, and your webhook URL. You will also need to have installed the requests library (pip install requests will do it).

Here’s the code:


import requests  # learn about requests: http://docs.python-requests.org/en/master/
from requests.auth import HTTPProxyAuth
from getpass import getpass

# replace with the address and port of your proxy server
proxyDict = {'http': 'your.proxy.server:port', 'https': 'your.proxy.server:port'}
# replace with the URL of your incoming webhook
webhook_url = "https://outlook.office.com/webhook/blahblahblah"
user = 'username'     # your proxy username
password = getpass()  # prompts for the proxy password without echoing it
auth = HTTPProxyAuth(user, password)
json_payload = {'text': 'hello world'}
response = requests.post(url=webhook_url, proxies=proxyDict, auth=auth, json=json_payload)
response.raise_for_status()  # raises an exception if the post failed

If it works you should see something like this posted to the channel for which you set up the incoming webhook:

[Screenshot: the ‘hello world’ message posted to the Teams channel]
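
One final thought on the payload. The plain {'text': ...} message above is all you strictly need, but as far as I’m aware the connector card format that incoming webhooks understand also accepts a title and a theme colour. Here’s a sketch of that, reusing the webhook_url, proxyDict and auth variables from the snippet above; the title and themeColor field names are my recollection of the Office 365 connectors documentation, so verify them against the link earlier in this post before relying on them:

# a sketch only: reuses webhook_url, proxyDict and auth from the snippet above;
# 'title' and 'themeColor' are assumed connector card fields, check the
# connectors documentation linked above before relying on them
card_payload = {
    'title': 'Hello from Python',
    'themeColor': '0076D7',
    'text': 'hello world, this time with a title'
}
requests.post(url=webhook_url, proxies=proxyDict, auth=auth, json=card_payload)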

Good luck!

@jamiet


Written by Jamiet

March 12, 2017 at 11:45 pm

Posted in Uncategorized


Creating a Spark dataframe containing only one column


I’ve been doing lots of Apache Spark development using Python (aka PySpark) recently, specifically Spark SQL (aka the dataframes API), and one thing I’ve found very useful for testing purposes is the ability to create a dataframe from literal values. The documentation at pyspark.sql.SQLContext.createDataFrame() covers this pretty well, however the code there describes how to create a dataframe containing more than one column, like so:

l = [('Alice', 1)]
sqlContext.createDataFrame(l).collect()
# returns [Row(_1=u'Alice', _2=1)]
sqlContext.createDataFrame(l, ['name', 'age']).collect()
# returns [Row(name=u'Alice', age=1)]

For simple testing purposes I wanted to create a dataframe that has only one column, so you might think that the above code could simply be amended like so:

l = [('Alice')]
sqlContext.createDataFrame(l).collect()
sqlContext.createDataFrame(l, ['name']).collect()

but unfortunately that throws an error:

TypeError: Can not infer schema for type: <type 'str'>

The reason is simple,

('Alice', 1)

returns a tuple whereas

('Alice')

returns a string.

type(('Alice', 1))  # returns tuple
type(('Alice'))     # returns str

The latter causes an error because createDataFrame() only creates a dataframe from an RDD of tuples, not an RDD of strings.

There is a very easy fix which will be obvious to any half-decent Python developer; unfortunately that’s not me, so I didn’t stumble on the answer immediately. It’s possible to create a one-element tuple by including a trailing comma, like so:

type(('Alice',)) # returns tuple

hence the earlier failing code can be adapted to this:

l = [('Alice',)]
sqlContext.createDataFrame(l).collect()
# returns [Row(_1=u'Alice')]
sqlContext.createDataFrame(l, ['name']).collect()
# returns [Row(name=u'Alice')]

It took me far longer than it should have done to figure that out!

Here is another snippet that creates a dataframe from literal values without letting Spark infer the schema (behaviour which, I believe, is deprecated anyway):

from pyspark.sql.types import *
schema = StructType([StructField("foo", StringType(), True)])
l = [('bar1',),('bar2',),('bar3',)]
sqlContext.createDataFrame(l, schema).collect()
# returns: [Row(foo=u'bar1'), Row(foo=u'bar2'), Row(foo=u'bar3')]

or, if you don’t want to use the one-element tuple workaround that I outlined above and would rather just pass a list of strings:

from pyspark.sql.types import *
from pyspark.sql import Row
schema = StructType([StructField("foo", StringType(), True)])
l = ['bar1','bar2','bar3']
rdd = sc.parallelize(l).map(lambda x: Row(x))
sqlContext.createDataFrame(rdd, schema).collect()
# returns [Row(foo=u'bar1'), Row(foo=u'bar2'), Row(foo=u'bar3')]
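
A variation on that, this one a sketch of my own rather than something from the Spark docs: if you construct each Row with a keyword argument then the column takes its name from the Row itself, so no explicit schema is needed at all (though note this relies on schema inference, which as I said above may be deprecated):

from pyspark.sql import Row

l = ['bar1', 'bar2', 'bar3']
# Row(foo=x) names the field, so the schema can be inferred from it
rdd = sc.parallelize(l).map(lambda x: Row(foo=x))
sqlContext.createDataFrame(rdd).collect()
# returns [Row(foo=u'bar1'), Row(foo=u'bar2'), Row(foo=u'bar3')]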

Happy sparking!

@Jamiet

Written by Jamiet

December 13, 2016 at 10:15 am

Posted in Uncategorized
