
Generating automatic YouTube video sitemap

I was playing with Google Webmaster Tools, noticed the video sitemap option, and decided to generate one. After a bit of searching I found very little on how one could be quickly generated automatically.

So I created the sitemap for a single video by hand, based on the example schema, and then realised this would make a nice, quick piece of Python code if we automated as much as possible.

Here is youtube_video_sitemap.py:

from xml.dom.minidom import Document
import gdata.youtube
import gdata.youtube.service

#http://code.google.com/apis/youtube/1.0/developers_guide_python.html#RetrievingVideoEntry
client = gdata.youtube.service.YouTubeService()
query = gdata.youtube.service.YouTubeVideoQuery()

############################
# Change these
location="http://www.ted.com"
changefreq="weekly"

query.max_results = 25
query.start_index = 1
query.author = "TEDtalksDirector"
############################

feed = client.YouTubeQuery(query)

doc = Document()
urlset  = doc.createElement("urlset")
urlset.setAttribute("xmlns", "http://www.sitemaps.org/schemas/sitemap/0.9")
urlset.setAttribute("xmlns:video", "http://www.google.com/schemas/sitemap-video/1.1")
doc.appendChild(urlset)

url = doc.createElement("url")

# Location  
elem = doc.createElement("loc")
elem_text = doc.createTextNode(location)
elem.appendChild(elem_text)
url.appendChild(elem)

for entry in feed.entry:
    # video
    video = doc.createElement("video:video")

    # video:thumbnail_loc
    elem = doc.createElement("video:thumbnail_loc")
    elem_text = doc.createTextNode(entry.media.thumbnail[0].url)
    elem.appendChild(elem_text)
    video.appendChild(elem)

    # video:title
    elem = doc.createElement("video:title")
    elem_text = doc.createTextNode(entry.media.title.text)
    elem.appendChild(elem_text)
    video.appendChild(elem)

    # video:description
    elem = doc.createElement("video:description")
    elem_text = doc.createTextNode(entry.media.description.text)
    elem.appendChild(elem_text)
    video.appendChild(elem)
    [...]

# end of url    
urlset.appendChild(url)
#print doc.toxml()
print doc.toprettyxml(indent="  ")

This produces a video sitemap for TED talks. Next, if I get the time, I may code up an App Engine app with a web UI.

Update 20110717: I wrote some still-buggy JavaScript with which you can generate a video sitemap in your browser.
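
The element-by-element construction in the script can also be wrapped in a small helper. What follows is only a sketch in Python 3 using the standard library's ElementTree instead of minidom and gdata. The function name, the dictionary keys and the choice to hang every video:video under a single url element are my assumptions for illustration, not something the original script confirms.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
VIDEO_NS = "http://www.google.com/schemas/sitemap-video/1.1"


def build_video_sitemap(location, videos):
    """Return sitemap XML with one <url> holding a <video:video> per entry.

    `videos` is a list of dicts with the (hypothetical) keys
    'thumbnail', 'title' and 'description'.
    """
    # Serialise with the default and "video" prefixes instead of ns0/ns1.
    ET.register_namespace("", SITEMAP_NS)
    ET.register_namespace("video", VIDEO_NS)

    urlset = ET.Element("{%s}urlset" % SITEMAP_NS)
    url = ET.SubElement(urlset, "{%s}url" % SITEMAP_NS)
    ET.SubElement(url, "{%s}loc" % SITEMAP_NS).text = location

    fields = (
        ("thumbnail_loc", "thumbnail"),
        ("title", "title"),
        ("description", "description"),
    )
    for item in videos:
        video = ET.SubElement(url, "{%s}video" % VIDEO_NS)
        for tag, key in fields:
            # ElementTree escapes characters such as & when serialising.
            ET.SubElement(video, "{%s}%s" % (VIDEO_NS, tag)).text = item[key]

    return ET.tostring(urlset, encoding="unicode")
```

Calling build_video_sitemap("http://www.example.com", [...]) with a list of such dictionaries returns the XML as a string, which can then be written to a file and submitted.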

Aventail and Radius authentication

I recently had an interesting issue in getting a SonicWall Aventail EX series device to communicate with a Radius server (Vasco).

After a bit of troubleshooting, it turned out that the Radius authentication response was being dropped by the iptables running on the appliance itself. tcpdump showed the Radius response message arriving at the appliance, but the error log in the Aventail Management Console showed that the Radius server had failed to respond.

The file /var/log/kern.iptables was logging the dropped packets. Searching SonicWall's website did not reveal anything useful. The iptables rules regarding Radius traffic:

aventail:/var/log# iptables -L -n | grep RADIUS
Chain RADIUS_FILTER (1 references)
RADIUS_FILTER  udp  --  0.0.0.0/0            0.0.0.0/0           udp spt:1645 

Poking around in the init scripts led to the directory /var/lib/iptables, which holds the file that needed to be updated before re-loading the rules:

aventail:/var/lib/iptables# diff active.radius.fix active
237c237
< -A UDP_FILTER -p udp --sport 1645 -j RADIUS_FILTER
---
> -A UDP_FILTER -p udp --sport 1812 -j RADIUS_FILTER

aventail:/var/lib/iptables# /etc/init.d/iptables reload
Starting iptables: loaded active state

aventail:/var/lib/iptables# iptables -L -n | grep 1812
RADIUS_FILTER  udp  --  0.0.0.0/0            0.0.0.0/0           udp spt:1812 
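
For anyone who would rather script the edit than change the file by hand, here is a minimal Python 3 sketch. It assumes the rules file uses the line format seen in the diff above, and that the change is from source port 1645 to 1812, as the before and after listings suggest. The function name and its parameters are hypothetical.

```python
import re


def retarget_sport(rules_text, chain, target, old_port, new_port):
    """Change --sport on UDP rules in `chain` that jump to `target`.

    Only lines of the exact form
        -A <chain> -p udp --sport <old_port> -j <target>
    are touched; everything else is returned unchanged.
    """
    pattern = re.compile(
        r"^(-A %s -p udp --sport )%d( -j %s)$"
        % (re.escape(chain), old_port, re.escape(target)),
        re.MULTILINE,
    )
    # A function is used as the replacement so the new port number cannot be
    # misread as part of a group reference.
    return pattern.sub(
        lambda m: "%s%d%s" % (m.group(1), new_port, m.group(2)), rules_text
    )
```

The result would still need to be written back to the rules file and loaded with the reload command shown above.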

Capture and later replay syslog events

I needed to capture syslog events going to a specific server and replay them later on a new server for testing purposes. I achieved this using Wireshark and tcpreplay.

Wireshark's command line interface (tshark) captures 4 hours (14400 seconds) of syslog events arriving on interface number 3 and writes the captured data to a file with a name such as 28_03_2011_12_56. Because the capture command is in a loop, the end result is self-rotating capture files.

@echo off
:TOP
set MY_DATE=%date:/=_%
set MY_HOUR="%time::=_%"
set FILE_NAME=%MY_DATE:~4%_%MY_HOUR:~1,5%

echo %FILE_NAME%

"c:\Wireshark\tshark.exe" -a duration:14400 -i 3 -f "udp port 514" -w "%FILE_NAME%"
REM PAUSE
GOTO TOP
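
The same rotating loop can be sketched in Python 3, which avoids depending on the locale-specific layout of %date% and %time%. The tshark options are the ones used in the batch file; the function names are made up for this example, and the day_month_year_hour_minute order is my reading of the sample file name above.

```python
import subprocess
from datetime import datetime

TSHARK = r"c:\Wireshark\tshark.exe"  # same path as in the batch file


def capture_file_name(now):
    """Name a capture after its start time, e.g. 28_03_2011_12_56."""
    return now.strftime("%d_%m_%Y_%H_%M")


def tshark_command(file_name, seconds=14400, interface="3",
                   capture_filter="udp port 514"):
    """Build the tshark argument list for one capture of `seconds` seconds."""
    return [
        TSHARK,
        "-a", "duration:%d" % seconds,  # stop automatically after this long
        "-i", interface,                # capture interface number
        "-f", capture_filter,           # capture filter: syslog over UDP
        "-w", file_name,                # output pcap file
    ]


def capture_forever():
    """Run captures back to back, each one writing to a new file."""
    while True:
        name = capture_file_name(datetime.now())
        print(name)
        subprocess.call(tshark_command(name))
```

Calling capture_forever() starts the loop; each time tshark exits after its duration, the next capture begins under a fresh name.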

We can then re-write the layer 2 destination MAC address to that of the new destination server. This can be done with tcprewrite, shown here with an example MAC address of aa:bb:cc:dd:11:22:

$ tcprewrite --enet-dmac=aa:bb:cc:dd:11:22 --infile=syslog_capture --outfile=syslog_rewrite
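
To make clear what this rewrite does to the capture, here is a rough Python 3 sketch of the same idea. It is not how tcprewrite is implemented. It assumes a classic pcap file (not pcapng) with microsecond timestamps and Ethernet framing, and the function names are hypothetical. It replaces the first six bytes of every frame, which is where the destination MAC address lives.

```python
import struct

GLOBAL_HEADER_LEN = 24  # classic pcap file header
RECORD_HEADER_LEN = 16  # ts_sec, ts_usec, incl_len, orig_len
LINKTYPE_ETHERNET = 1


def parse_mac(mac):
    """Turn 'aa:bb:cc:dd:11:22' into six raw bytes."""
    parts = mac.split(":")
    if len(parts) != 6:
        raise ValueError("expected six colon-separated octets: %r" % mac)
    return bytes(int(part, 16) for part in parts)


def rewrite_dst_mac(pcap_bytes, new_mac):
    """Return a copy of the capture with every destination MAC replaced."""
    magic = pcap_bytes[:4]
    if magic == b"\xd4\xc3\xb2\xa1":
        endian = "<"  # written by a little-endian host
    elif magic == b"\xa1\xb2\xc3\xd4":
        endian = ">"  # written by a big-endian host
    else:
        raise ValueError("not a classic microsecond pcap file")

    (linktype,) = struct.unpack(endian + "I", pcap_bytes[20:24])
    if linktype != LINKTYPE_ETHERNET:
        raise ValueError("this sketch only handles Ethernet captures")

    mac = parse_mac(new_mac)
    out = bytearray(pcap_bytes[:GLOBAL_HEADER_LEN])
    offset = GLOBAL_HEADER_LEN
    while offset + RECORD_HEADER_LEN <= len(pcap_bytes):
        header = pcap_bytes[offset:offset + RECORD_HEADER_LEN]
        (incl_len,) = struct.unpack(endian + "I", header[8:12])
        start = offset + RECORD_HEADER_LEN
        frame = pcap_bytes[start:start + incl_len]
        out += header
        # Bytes 0-5 of an Ethernet frame are the destination MAC.
        out += mac + frame[6:] if len(frame) >= 6 else frame
        offset = start + incl_len
    return bytes(out)
```

Reading syslog_capture in binary mode, passing its contents through rewrite_dst_mac and writing the result to a new file gives a capture addressed to the new server's MAC. The IP addresses inside the packets are left untouched.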

tcprewrite writes the updated packets to a new pcap file, which we now need to replay:

$ tcpreplay --pps=10 --intf1=xl0 syslog_rewrite
sending out xl0
processing file: syslog_rewrite

Actual: 1591 packets (407038 bytes) sent in 159.12 seconds
Rated: 2558.0 bps, 0.02 Mbps/sec, 10.00 pps

Statistics for network device: xl0
        Attempted packets:   1591
        Successful packets:  1591
        Failed packets:      0
        Retried packets:     0
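
The numbers in this report can be sanity-checked with a little arithmetic, which also helps when picking a --pps value for a replay of a given length. This is a small Python 3 sketch with hypothetical helper names. The "Rated" figure of 2558.0 appears to be bytes per second, since 407038 bytes over 159.12 seconds comes to roughly that value.

```python
def replay_seconds(packets, pps):
    """Approximate time needed to replay `packets` at `pps` packets per second."""
    return packets / float(pps)


def pps_for_duration(packets, seconds):
    """Packets per second needed to finish the replay in about `seconds`."""
    return packets / float(seconds)


def bytes_per_second(total_bytes, seconds):
    """Average byte rate over the whole replay."""
    return total_bytes / float(seconds)
```

For the run above, replay_seconds(1591, 10) gives about 159 seconds, in line with the reported 159.12. To squeeze the same capture into roughly one minute, pps_for_duration(1591, 60) suggests a rate of about 26.5 packets per second.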