Getting familiar with LINQ to Objects. Florin−Tudor Cristea, Microsoft Student Partner. Introducing our running example. “Never trust a computer you can’t throw out a window.” (Steve Wozniak). LinqBooks, a personal book-cataloging system.
Getting familiar with LINQ to Objects
Florin−Tudor Cristea, Microsoft Student Partner
Introducing our running example “Never trust a computer you can’t throw out a window.” (Steve Wozniak)
The main features LinqBooks should have include the ability to:
• Track what books we have;
• Store what we think about them;
• Retrieve more information about our books;
• Publish our list of books and our review information.
The technical features we’ll implement include:
• Querying/inserting/updating data in a local database;
• Providing search capabilities over both the local catalog and third parties (such as Amazon or Google);
• Importing data about books from a web site;
• Importing and persisting some data from/in XML documents;
• Creating RSS feeds for the books you recommend.
In order to implement these features, we’ll use a set of business entities. The object model we’ll use consists of the following classes: Book, Author, Publisher, Subject, Review, and User. We’ll first use these objects in memory with LINQ to Objects, but later on we’ll have to persist this data in a database.
Using LINQ with in-memory collections “If Java had true garbage collection, most programs would delete themselves upon execution.” (Robert Sewell)
All that is required for a collection to be queryable through LINQ to Objects is that it implements the IEnumerable<T> interface. As a reminder, objects implementing the IEnumerable<T> interface are called sequences in LINQ vocabulary. The good news is that almost every generic collection provided by the .NET Framework implements IEnumerable<T>. This means that you’ll be able to query the usual collections you were already working with in .NET 2.0.
Arrays (UntypedArray.csproj, TypedArray.csproj)
Generic lists (GenericList.csproj)
• System.Collections.Generic.List<T>
• System.Collections.Generic.LinkedList<T>
• System.Collections.Generic.Queue<T>
• System.Collections.Generic.Stack<T>
• System.Collections.Generic.HashSet<T>
• System.Collections.ObjectModel.Collection<T>
• System.ComponentModel.BindingList<T>
Generic dictionaries (GenericDictionary.csproj)
• System.Collections.Generic.Dictionary<TKey, TValue>
• System.Collections.Generic.SortedDictionary<TKey, TValue>
• System.Collections.Generic.SortedList<TKey, TValue>
String (String.csproj)
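As a minimal sketch of the point above, the same query operators can be applied to an array, a generic list, a dictionary, and a string, because each of them is a sequence. The class name, method names, and sample values below are hypothetical and only serve as illustration; they are not taken from the sample projects.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class SequenceKinds
{
    // Accepts any IEnumerable<int>: an array, a List<int>, a HashSet<int>, ...
    public static int CountEven(IEnumerable<int> numbers) =>
        numbers.Count(n => n % 2 == 0);

    // A Dictionary<TKey, TValue> enumerates as KeyValuePair<TKey, TValue> items.
    public static List<string> KeysPricedAbove(Dictionary<string, decimal> prices, decimal min) =>
        prices.Where(pair => pair.Value > min)
              .Select(pair => pair.Key)
              .OrderBy(key => key) // dictionary enumeration order is unspecified, so sort
              .ToList();

    // A string enumerates as a sequence of char.
    public static int CountUpper(string text) =>
        text.Count(c => char.IsUpper(c));

    public static void Main()
    {
        Console.WriteLine(CountEven(new[] { 1, 2, 3, 4 }));       // array
        Console.WriteLine(CountEven(new List<int> { 2, 4, 6 }));  // generic list

        // Hypothetical titles and prices.
        var prices = new Dictionary<string, decimal>
        {
            ["Title B"] = 35m,
            ["Title A"] = 25m,
            ["Title C"] = 9.99m
        };
        Console.WriteLine(string.Join(", ", KeysPricedAbove(prices, 20m)));

        Console.WriteLine(CountUpper("LINQ to Objects"));         // string
    }
}
```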
The nongeneric collections do not implement IEnumerable<T>, but implement IEnumerable. Does this mean that you won’t be able to use LINQ with DataSet or ArrayList objects, for example? Fortunately, solutions exist. Later we’ll demonstrate how you can query nongeneric collections thanks to the Cast and OfType query operators.
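A short sketch of the two operators mentioned above, applied to an ArrayList. Cast<T> converts every element and throws an InvalidCastException during enumeration if one does not fit, while OfType<T> silently keeps only the elements of the requested type. The class name, method names, and sample values are hypothetical.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;
using System.Linq;

public static class NonGenericQueries
{
    // Cast<string>() turns the untyped ArrayList into an IEnumerable<string>.
    public static List<string> LongTitles(ArrayList titles) =>
        titles.Cast<string>()
              .Where(title => title.Length > 10)
              .ToList();

    // OfType<int>() skips every element that is not an int.
    public static List<int> OnlyInts(ArrayList mixed) =>
        mixed.OfType<int>().ToList();

    public static void Main()
    {
        var titles = new ArrayList { "LINQ in Action", "C# 3.0" };
        Console.WriteLine(string.Join(", ", LongTitles(titles)));

        var mixed = new ArrayList { 1, "two", 3 };
        Console.WriteLine(string.Join(", ", OnlyInts(mixed)));
    }
}
```

In a query expression, giving the range variable an explicit type (from string title in titles) has the same effect as calling Cast<string>().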
Here is an overview of the families of the standard query operators: Restriction, Projection, Partitioning, Join, Ordering, Grouping, Set, Conversion, Equality, Element, Generation, Quantifiers, and Aggregation. As you can see, a wide range of operations is supported.
Using LINQ with ASP.NET and Windows Forms “There are only two kinds of programming languages: those people always bitch about and those nobody uses.” (Bjarne Stroustrup)
ASP.NET controls support data binding to any IEnumerable collection. This makes it easy to display the result of language-integrated queries using controls like GridView, DataList, and Repeater (Step1.aspx). We can use two methods to display only the properties we want: either declare specific columns at the grid level, or explicitly select only the Title and Price properties in the query (Step2a.aspx, Step2b.aspx).
You can use an anonymous type to map your domain model to a presentation model. In the following query, creating an anonymous type allows a flat view of our domain model:
    from book in SampleData.Books
    where book.Title.Length > 10
    orderby book.Price
    select new { book.Title, book.Price, Publisher = book.Publisher.Name, Authors = book.Authors.Count() };
Using LINQ in a Windows Forms application isn’t more difficult than with ASP.NET in a web application. We’ll see how we can do the same kind of databinding operations between LINQ query results and standard Windows Forms controls in a sample application. FormStrings.cs FormBooks.cs (DataPropertyName)
You should notice two things in comparison to the code we used for the ASP.NET web application sample. First, we use an anonymous type to create objects containing a Book property. This is because the DataGridView control displays the properties of objects by default. If we returned strings instead of custom objects, all we would see displayed would be the title’s Length, because that’s the only property on strings. Second, we convert the result sequence into a list. This is required for the grid to perform data binding. Alternatively, we could use a BindingSource object.
Focus on major standard query operators “There are two major products that come out of Berkeley: LSD and UNIX. We don’t believe this to be a coincidence.” (Jeremy S. Anderson)
Filtering: Where
    public static IEnumerable<T> Where<T>(
        this IEnumerable<T> source,
        Func<T, bool> predicate);
    public static IEnumerable<T> Where<T>(
        this IEnumerable<T> source,
        Func<T, int, bool> predicate);
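    // Books that cost at least 15
    IEnumerable<Book> books = SampleData.Books.Where(book => book.Price >= 15);
    // Books that cost at least 15 and sit at an odd index in the source sequence
    IEnumerable<Book> books = SampleData.Books.Where(
        (book, index) => (book.Price >= 15) && ((index & 1) == 1));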
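To make the difference between the two overloads concrete, here is a small self-contained sketch that uses plain prices instead of Book objects. The class name, method names, and values are hypothetical. Note that the index passed to the predicate is the element's position in the source sequence, not its position among the elements that pass the filter.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class WhereExamples
{
    // Overload 1: the predicate sees only the element.
    public static List<decimal> AtLeast(IEnumerable<decimal> prices, decimal min) =>
        prices.Where(price => price >= min).ToList();

    // Overload 2: the predicate also receives the zero-based source index.
    public static List<decimal> AtLeastAtOddIndex(IEnumerable<decimal> prices, decimal min) =>
        prices.Where((price, index) => price >= min && (index & 1) == 1).ToList();

    public static void Main()
    {
        // Source indices:        0    1    2    3    4    5
        var prices = new[] { 10m, 20m, 30m, 12m, 40m, 50m };

        Console.WriteLine(string.Join(", ", AtLeast(prices, 15m)));
        Console.WriteLine(string.Join(", ", AtLeastAtOddIndex(prices, 15m)));
    }
}
```

With these values the first call keeps 20, 30, 40, and 50. The second keeps only 20 and 50: the element at index 3 is too cheap, and the other matches sit at even indices.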
Projection: Select
    public static IEnumerable<S> Select<T, S>(
        this IEnumerable<T> source,
        Func<T, S> selector);
    IEnumerable<String> titles = SampleData.Books.Select(book => book.Title);
Projection: SelectMany
    public static IEnumerable<S> SelectMany<T, S>(
        this IEnumerable<T> source,
        Func<T, IEnumerable<S>> selector);
The SelectMany operator applies the selector function to each element of the source sequence to obtain one new sequence per element, and concatenates the resulting sequences into a single sequence.
With Select, we get a sequence of sequences and need two nested loops:
    IEnumerable<IEnumerable<Author>> tmp = SampleData.Books
        .Select(book => book.Authors);
    foreach (var authors in tmp)
    {
        foreach (Author author in authors)
        {
            Console.WriteLine(author.LastName);
        }
    }
With SelectMany, the result is already flat:
    IEnumerable<Author> authors = SampleData.Books
        .SelectMany(book => book.Authors);
    foreach (Author author in authors)
    {
        Console.WriteLine(author.LastName);
    }
The equivalent query expression uses two from clauses:
    from book in SampleData.Books
    from author in book.Authors
    select author.LastName
The Select and SelectMany operators can be used to retrieve the index of each element in a sequence. Let’s say we want to display the index of each book in our collection before we sort them in alphabetical order (SelectIndex.csproj):
    var books = SampleData.Books
        .Select((book, index) => new { index, book.Title })
        .OrderBy(book => book.Title);
To remove duplication, we can use the Distinct operator. Distinct eliminates duplicate elements from a sequence. In order to compare the elements, the Distinct operator uses the elements’ implementation of the IEquatable<T>.Equals method if the elements implement the IEquatable<T> interface. It uses their implementation of the Object.Equals method otherwise (Distinct.csproj).
    var authors = SampleData.Books
        .SelectMany(book => book.Authors)
        .Distinct()
        .Select(author => author.FirstName + " " + author.LastName);
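The following sketch shows how an element type can opt into value-based comparison so that Distinct treats two separate instances with the same name as duplicates. The Author class here is a hypothetical stand-in, not the one from the sample projects. Distinct relies on the default equality comparer, which uses hash codes as well as Equals, so GetHashCode is overridden consistently with Equals.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for the sample Author class.
public sealed class Author : IEquatable<Author>
{
    public Author(string firstName, string lastName)
    {
        FirstName = firstName;
        LastName = lastName;
    }

    public string FirstName { get; }
    public string LastName { get; }

    // Picked up by the default equality comparer because Author implements IEquatable<Author>.
    public bool Equals(Author other) =>
        other != null && FirstName == other.FirstName && LastName == other.LastName;

    public override bool Equals(object obj) => Equals(obj as Author);

    // Must agree with Equals: equal authors produce equal hash codes.
    public override int GetHashCode() => (FirstName, LastName).GetHashCode();
}

public static class DistinctExample
{
    public static List<string> DistinctNames(IEnumerable<Author> authors) =>
        authors.Distinct()
               .Select(author => author.FirstName + " " + author.LastName)
               .ToList();

    public static void Main()
    {
        var authors = new[]
        {
            new Author("Ann", "Lee"),
            new Author("Bob", "Ray"),
            new Author("Ann", "Lee") // different instance, same value
        };

        foreach (string name in DistinctNames(authors))
        {
            Console.WriteLine(name);
        }
    }
}
```

Without the Equals and GetHashCode overrides, the two "Ann Lee" instances would be compared by reference and both would survive Distinct.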
Conversion: ToArray, ToList, ToDictionary
ToArray and ToList are useful when you want to request immediate execution of a query or cache the result of a query. When invoked, these operators completely enumerate the source sequence on which they are applied and build a snapshot of the elements it returns. ToDictionary does the same, indexing each element by the key that its key selector returns:
    Dictionary<String, Book> isbnRef = SampleData.Books.ToDictionary(book => book.Isbn);
    Book linqRules = isbnRef["0-111-77777-2"];
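A small sketch of why the snapshot matters: a query without ToList is re-evaluated each time it is enumerated, so it sees later changes to the source, whereas the list produced by ToList does not. The class name, method name, and values are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DeferredExecution
{
    public static (int deferredCount, int cachedCount) Compare()
    {
        var prices = new List<decimal> { 10m, 25m, 40m };

        // Deferred: nothing is evaluated until the query is enumerated.
        IEnumerable<decimal> deferred = prices.Where(price => price > 20m);

        // Immediate: ToList enumerates the query now and stores the result.
        List<decimal> cached = prices.Where(price => price > 20m).ToList();

        // Change the source after both queries were defined.
        prices.Add(60m);

        // The deferred query sees the new element; the cached list does not.
        return (deferred.Count(), cached.Count);
    }

    public static void Main()
    {
        var result = Compare();
        Console.WriteLine("deferred: " + result.deferredCount); // 3
        Console.WriteLine("cached:   " + result.cachedCount);   // 2
    }
}
```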
Aggregate: Count, Sum, Min, Max
    var minPrice = SampleData.Books.Min(book => book.Price);
    var maxPrice = SampleData.Books.Select(book => book.Price).Max();
    var totalPrice = SampleData.Books.Sum(book => book.Price);
    var nbCheapBooks = SampleData.Books.Where(book => book.Price < 30).Count();
Creating views on an object graph in memory “19 Jan 2038 at 3:14:07 AM” (End of the world according to Unix: 2^31 − 1 seconds after January 1, 1970)
Sorting
Let’s say we’d like to view our books sorted by publisher, then by descending price, and then by ascending title (Sorting.aspx):
    from book in SampleData.Books
    orderby book.Publisher.Name, book.Price descending, book.Title
    select new { Publisher = book.Publisher.Name, book.Price, book.Title };
A query expression’s orderby clause translates to a composition of calls to the OrderBy, ThenBy, OrderByDescending, and ThenByDescending operators:
    SampleData.Books
        .OrderBy(book => book.Publisher.Name)
        .ThenByDescending(book => book.Price)
        .ThenBy(book => book.Title)
        .Select(book => new { Publisher = book.Publisher.Name, book.Price, book.Title });
Nested queries
Let’s say we want to display publishers and their books in the same grid (Nested.aspx):
    from publisher in SampleData.Publishers
    select new
    {
        Publisher = publisher.Name,
        Books = from book in SampleData.Books
                where book.Publisher.Name == publisher.Name
                select book
    }
Grouping
Using grouping, we’ll get the same result as with the previous sample except that we don’t see the publishers without books this time (Grouping.aspx):
    from book in SampleData.Books
    group book by book.Publisher into publisherBooks
    select new { Publisher = publisherBooks.Key.Name, Books = publisherBooks };
The publisherBooks group is an instance of the IGrouping<TKey, T> interface. Here is how this interface is defined:
    public interface IGrouping<TKey, T> : IEnumerable<T>
    {
        TKey Key { get; }
    }
You can see that an object that implements the IGrouping generic interface has a strongly typed key and is a strongly typed enumeration. In our case, the key is a Publisher object, and the enumeration is of type IEnumerable<Book>.
Advantages:
• the query is shorter;
• we can name the group.
    from book in SampleData.Books
    group book by book.Publisher into publisherBooks
    select new
    {
        Publisher = publisherBooks.Key.Name,
        Books = publisherBooks,
        Count = publisherBooks.Count() // a method call needs an explicit member name
    };
Group join
Join operators allow us to perform the same kind of operations as projections, nested queries, or grouping do, but their advantage is that they follow a syntax close to what SQL offers (Joins.aspx):
    from publisher in SampleData.Publishers
    join book in SampleData.Books
        on publisher equals book.Publisher into publisherBooks
    select new { Publisher = publisher.Name, Books = publisherBooks };
This is a group join. It bundles each publisher’s books as sequences named publisherBooks. As with nested queries, publishers with no books appear in the results this time.
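The difference between grouping and a group join can be checked with a small in-memory sketch. The Publisher and Book classes below are simplified, hypothetical stand-ins for the sample data, and the publisher names are invented. One publisher has no books: it is absent from the group by result and present, with an empty group, in the group join result.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Simplified, hypothetical stand-ins for the sample classes.
public sealed class Publisher
{
    public Publisher(string name) { Name = name; }
    public string Name { get; }
}

public sealed class Book
{
    public Book(string title, Publisher publisher) { Title = title; Publisher = publisher; }
    public string Title { get; }
    public Publisher Publisher { get; }
}

public static class GroupingVersusGroupJoin
{
    // Starts from the books, so a publisher without books never produces a group.
    public static List<string> ViaGroupBy(IEnumerable<Book> books) =>
        (from book in books
         group book by book.Publisher into publisherBooks
         select publisherBooks.Key.Name + ": " + publisherBooks.Count()).ToList();

    // Starts from the publishers, so every publisher appears, possibly with an empty group.
    public static List<string> ViaGroupJoin(IEnumerable<Publisher> publishers, IEnumerable<Book> books) =>
        (from publisher in publishers
         join book in books on publisher equals book.Publisher into publisherBooks
         select publisher.Name + ": " + publisherBooks.Count()).ToList();

    public static void Main()
    {
        var a = new Publisher("A Press");
        var b = new Publisher("B Press");
        var empty = new Publisher("Empty Press");
        var publishers = new[] { a, b, empty };
        var books = new[] { new Book("Book 1", a), new Book("Book 2", a), new Book("Book 3", b) };

        Console.WriteLine(string.Join("; ", ViaGroupBy(books)));
        Console.WriteLine(string.Join("; ", ViaGroupJoin(publishers, books)));
    }
}
```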
Inner join
An inner join essentially finds the intersection between two sequences. With an inner join, the elements from two sequences that meet a matching condition are combined to form a single sequence:
    from publisher in SampleData.Publishers
    join book in SampleData.Books on publisher equals book.Publisher
    select new { Publisher = publisher.Name, Book = book.Title };
This query is similar to the one we used in the group join sample. The difference here is that we don’t use the into keyword to group the elements. Instead, the books are projected on the publishers.
    SampleData.Publishers
        .Join(SampleData.Books,        // inner sequence
            publisher => publisher,    // outer key selector
            book => book.Publisher,    // inner key selector
            (publisher, book) => new   // result selector
            {
                Publisher = publisher.Name,
                Book = book.Title
            });
Left outer join When we want to keep all elements from the outer sequence, independently of whether there is a matching element in the inner sequence, we need to perform a left outer join.
    from publisher in SampleData.Publishers
    join book in SampleData.Books
        on publisher equals book.Publisher into publisherBooks
    from book in publisherBooks.DefaultIfEmpty()
    select new
    {
        Publisher = publisher.Name,
        Book = book == default(Book) ? "(no books)" : book.Title
    };
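For comparison with the method-syntax versions shown for the other joins, here is one way the left outer join could be written with explicit operator calls: a GroupJoin followed by a SelectMany over DefaultIfEmpty. This is a sketch against simplified, hypothetical Publisher and Book classes and invented names, not the code from Joins.aspx.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Simplified, hypothetical stand-ins for the sample classes.
public sealed class Publisher
{
    public Publisher(string name) { Name = name; }
    public string Name { get; }
}

public sealed class Book
{
    public Book(string title, Publisher publisher) { Title = title; Publisher = publisher; }
    public string Title { get; }
    public Publisher Publisher { get; }
}

public static class LeftOuterJoin
{
    public static List<string> Rows(IEnumerable<Publisher> publishers, IEnumerable<Book> books) =>
        publishers
            // Step 1: pair each publisher with the (possibly empty) sequence of its books.
            .GroupJoin(books,
                publisher => publisher,
                book => book.Publisher,
                (publisher, publisherBooks) => new { publisher, publisherBooks })
            // Step 2: flatten; DefaultIfEmpty yields a single null for an empty group,
            // so a publisher without books still produces one row.
            .SelectMany(
                pair => pair.publisherBooks.DefaultIfEmpty(),
                (pair, book) => pair.publisher.Name + ": " +
                                (book == null ? "(no books)" : book.Title))
            .ToList();

    public static void Main()
    {
        var a = new Publisher("A Press");
        var empty = new Publisher("Empty Press");
        var books = new[] { new Book("Book 1", a), new Book("Book 2", a) };

        foreach (string row in Rows(new[] { a, empty }, books))
        {
            Console.WriteLine(row);
        }
    }
}
```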
Cross join
A cross join computes the Cartesian product of all the elements from two sequences.
    from publisher in SampleData.Publishers
    from book in SampleData.Books
    select new
    {
        Correct = (publisher == book.Publisher),
        Publisher = publisher.Name,
        Book = book.Title
    };
The same query written with explicit operator calls:
    SampleData.Publishers.SelectMany(
        publisher => SampleData.Books.Select(
            book => new
            {
                Correct = (publisher == book.Publisher),
                Publisher = publisher.Name,
                Book = book.Title
            }));
Partitioning
Let’s say we want to display a maximum of three books on a page. This can be done easily using the GridView control’s paging features (Paging.aspx):
    <asp:GridView ID="GridView1" runat="server"
        AllowPaging="true" PageSize="3"
        OnPageIndexChanging="GridView1_PageIndexChanging">
    </asp:GridView>
Here we use ToList in order to enable paging because a sequence doesn’t provide the necessary support for it. Paging is useful and easy to activate with the GridView control, but this does not have a lot to do with LINQ. The grid handles it all by itself.
Skip and Take
When you want to keep only a range of the data returned by a sequence, you can use the two partitioning query operators: Skip and Take. The Skip operator skips a given number of elements from a sequence and then yields the remainder of the sequence. The Take operator yields a given number of elements from a sequence and then ignores the remainder of the sequence. The canonical expression for returning the page with zero-based index n, given a page size of pageSize, is: sequence.Skip(n * pageSize).Take(pageSize).
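The paging expression above can be wrapped in a small helper, sketched here together with a second helper that selects an inclusive index range. The class and method names are hypothetical. Asking for a page past the end of the sequence simply yields an empty result, because Skip and Take do not throw when the sequence is shorter than requested.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class Paging
{
    // Returns the page with the given zero-based index.
    public static List<T> Page<T>(IEnumerable<T> source, int pageIndex, int pageSize) =>
        source.Skip(pageIndex * pageSize).Take(pageSize).ToList();

    // Returns the elements from startIndex to endIndex, both inclusive.
    public static List<T> Slice<T>(IEnumerable<T> source, int startIndex, int endIndex) =>
        source.Skip(startIndex).Take(endIndex - startIndex + 1).ToList();

    public static void Main()
    {
        IEnumerable<int> items = Enumerable.Range(1, 7); // 1, 2, ..., 7

        Console.WriteLine(string.Join(", ", Page(items, 0, 3)));  // 1, 2, 3
        Console.WriteLine(string.Join(", ", Page(items, 2, 3)));  // 7 (last, partial page)
        Console.WriteLine(Page(items, 3, 3).Count);               // 0 (past the end)
        Console.WriteLine(string.Join(", ", Slice(items, 2, 4))); // 3, 4, 5
    }
}
```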
Let’s say we want to keep only a subset of the books. We can do this thanks to two combo boxes allowing us to select the start and end indices (Partitioning.aspx):
    SampleData.Books
        .Select((book, index) => new { Index = index, Book = book.Title })
        .Skip(startIndex)
        .Take(endIndex - startIndex + 1);