Wednesday, 23 July 2014

Getting a list of files from a MOSS document library using a SharePoint web service


My challenge was simple.  I needed to develop an SSIS package that would download and extract data from every Excel file held in document libraries across several SharePoint sites.  SSIS was the natural choice as the data needed to be cleaned and validated before being imported into a database. However, SSIS is not great with web services – especially in the data flow. As I not worked with the SharePoint web services much, I started with a good old Console application.
 
MOSS, or more accurately, WSS provides a whole host of web services to obtain information about SharePoint sites.  However, figuring out which method to invoke and what parameters to pass is more problematic.  Especially as many of the parameters are chunks of Collaborative Application Mark-up Language (CAML) – a dialect of XML developed by Microsoft specifically for use with SharePoint.
 
A False Start
My first console app simply obtained the GUID of the document library using the GetListCollection() method of the Lists web service.  The GUID was then passed to the GetListItems() method which duly provided all documents and folders at the top level of the document library.  It then seemed logical to me to recursively call the GetListItems() method using the GUID of each sub-folder.  On no, how wrong could I be!  The GetListItems() method simply chokes on these folder GUIDs.
 
On searching the internet I found many other incorrect forum posts and blog entries about the same topic – but no working solutions.  I also made an extensive search of my eBook collection – but again no solutions – which overall motivated me to write this blog entry.
 
The solution - RTFM
Well, if I had read the whole page in the manual, I would have got to the solution earlier.  The key to my puzzle was the QueryOptions XML fragment which has both a Folder element and the all important <ViewAttributes Scope="Recursive" /> element. Using these elements together, it is possible to obtain a list of all documents in all subfolders in the list. Indeed, it does not even bother returning the subfolder details!
 
So here is the code for my working C# sample.
 
using System;
using System.Collections.Generic;
using System.Text;
using System.Xml;
using System.Web.Services;
using System.Web;
using System.Net;

namespace ConsoleApplication1
{
    class Program
    {
        static void Main(string[] args)
        {
string siteUrl = @"http://yourserver/sites/yoursite";
string documentLibraryName = @"Shared Documents";
SharePointList.Lists wsList = new SharePointList.Lists();
wsList.Credentials = System.Net.CredentialCache.DefaultCredentials;
WebProxy proxyObj = new WebProxy("yourproxy", 80);
wsList.Proxy = proxyObj;
wsList.Url = siteUrl + @"/_vti_bin/lists.asmx";

// get a list of all top level lists
XmlNode allLists = wsList.GetListCollection();
// load into an XML document so we can use XPath to query content
XmlDocument allListsDoc = new XmlDocument();
allListsDoc.LoadXml(allLists.OuterXml);
// allListsDoc.Save(@"c:\allListsDoc.xml"); // for debug
XmlNamespaceManager ns = new XmlNamespaceManager(allListsDoc.NameTable);
ns.AddNamespace("d", allLists.NamespaceURI);

// now get the GUID of the document library we are looking for
XmlNode dlNode = allListsDoc.SelectSingleNode("/d:Lists/d:List[@Title='" + documentLibraryName + "']", ns);
if (dlNode == null)
{
Console.WriteLine("Document Library '{0}' not found!", documentLibraryName);
}
else
{
// obtain the GUID for the document library and the webID
string documentLibraryGUID = dlNode.Attributes["ID"].Value;
string webId = dlNode.Attributes["WebId"].Value;

Console.WriteLine("Opening folder '{0}' GUID={1}", documentLibraryName, documentLibraryGUID);

// create ViewFields CAML
XmlDocument viewFieldsDoc = new XmlDocument();
XmlNode ViewFields = AddXmlElement(viewFieldsDoc, "ViewFields", "");
AddFieldRef(ViewFields, "GUID");
AddFieldRef(ViewFields, "ContentType");
AddFieldRef(ViewFields, "BaseName");
AddFieldRef(ViewFields, "Modified");
AddFieldRef(ViewFields, "EncodedAbsUrl");
//viewFieldsDoc.Save(@"c:\viewFields.xml"); // for debug

// create QueryOptions CAML
XmlDocument queryOptionsDoc = new XmlDocument();
XmlNode QueryOptions = AddXmlElement(queryOptionsDoc, "QueryOptions", "");
AddXmlElement(QueryOptions, "Folder", documentLibraryName);
AddXmlElement(QueryOptions, "IncludeMandatoryColumns", "FALSE");
// this element is the key to getting the full recusive list
XmlNode node = AddXmlElement(QueryOptions, "ViewAttributes", "");
AddXmlAttribute(node, "Scope", "Recursive");
queryOptionsDoc.Save(@"c:\queryOptions.xml"); // for debug

// obtain the list of items in the document library
XmlNode listContent = wsList.GetListItems(documentLibraryGUID, null, null, ViewFields, null, QueryOptions, webId);

XmlDocument xmlResultsDoc = new XmlDocument();
xmlResultsDoc.LoadXml(listContent.OuterXml);
ns = new XmlNamespaceManager(xmlResultsDoc.NameTable);
ns.AddNamespace("z", "#RowsetSchema");
// xmlResultsDoc.Save(@"c:\listContent.xml"); // for debug

XmlNodeList rows = xmlResultsDoc.SelectNodes("//z:row", ns);
if (rows.Count == 0)
{
Console.WriteLine("No content found");
}
foreach (XmlNode row in rows)
{
Console.WriteLine(row.Attributes["ows_ContentType"].Value + " " + row.Attributes["ows_GUID"].Value + " :: " + row.Attributes["ows_BaseName"].Value);
}
}
Console.WriteLine("Done");
Console.Read();
}

public static XmlNode AddXmlElement(XmlNode parent, string elementName, string elementValue)
{
XmlNode element = parent.AppendChild(parent.OwnerDocument.CreateNode(XmlNodeType.Element, elementName, ""));
if (elementValue != "")
element.InnerText = elementValue;
return (element);
}

public static XmlNode AddXmlElement(XmlDocument parent, string elementName, string elementValue)
{
XmlNode element = parent.AppendChild(parent.CreateNode(XmlNodeType.Element, elementName, ""));
if (elementValue != "")
element.InnerText = elementValue;
return (element);
}

public static XmlNode AddXmlAttribute(XmlNode element, string attrName, string attrValue)
{
XmlNode attr = element.Attributes.Append((XmlAttribute)element.OwnerDocument.CreateNode(XmlNodeType.Attribute, attrName, ""));
if (attrValue != "")
attr.Value = attrValue;
return (attr);
}

public static void AddFieldRef(XmlNode viewFields, string fieldName)
{
    XmlNode fieldRef = AddXmlElement(viewFields, "FieldRef", "");
    AddXmlAttribute(fieldRef, "Name", fieldName);
}
}
}

0 comments:

Post a Comment