
LINQ Script: A Universal Object-Oriented Database Query Language

6 Feb 2014 | CPOL | 14 min read
Using LINQ script to query the Object-Oriented Biological database

Background

The object-relational (o-r) database is the most popular kind of database in the world today, and as we all know, we can use SQL scripts to query it. But as the internet industry develops, the o-r database will no longer fit the data storage requirements of the future, and object-oriented (o-o) database products are under development to meet those requirements. The o-o database has not yet been applied successfully in general use.

But the object-oriented database has been applied successfully in bioinformatics research: almost every well-known biological database is an o-o database, and these databases serve biological researchers all over the world, supporting their scientific research projects for free, even though almost all of them store their biological data in large TEXT files.

Is there a universal query language for these biological databases? No: each kind of database has its own way of being queried. From my programming experience during my university studies, I have found that LINQ is the best query language for these o-o databases. Unfortunately, we must write the LINQ in our code and then compile that code in Visual Studio. LINQ is integrated with our VisualBasic/C# code; it cannot run alongside the program as a separate script the way SQL can, because it is just part of the syntax of the VisualBasic/C# languages. And of course, not all biological researchers use Visual Studio or have it installed on their computers.

So in the work described in this article, I turn LINQ into a universal script for querying these biological databases and other object-oriented databases.

Introduction

Basic knowledge of LINQ

Language-Integrated Query (LINQ) is an innovation introduced in Visual Studio 2008 and .NET Framework version 3.5 that bridges the gap between the world of objects and the world of data. Microsoft divides LINQ into 4 kinds: LINQ to SQL (SQL Server databases), LINQ to XML (XML documents), LINQ to DataSet (ADO.NET DataSets), and LINQ to Objects (.NET collections, files, strings and so on).

But commonly we divide LINQ into 3 kinds: LINQ to SQL, LINQ to XML, and LINQ to Objects. LINQ to Objects is the most used kind of LINQ in .NET programming.

Here is a typical syntax template for a LINQ to Objects query statement:

VB
Dim Collection As Generic.IEnumerable(Of <TYPE>) =
From object As <TYPE> In <TYPE Object Collection>[.AsParallel]
[Let readonlyObject1 = expression1 Let readonlyObject2 = expression2 …] 
Where <Boolean Expression> 
[Let readonlyObject3 = expression3 …]
Select <Expression>
[Distinct Order By <Expression> Ascending|Descending]

Usually any For loop in VisualBasic/C# can be converted into a LINQ statement, because most of what a For loop does is a data query operation; in that sense our programs are largely built from query statements. We can also easily make such a query run in parallel with the .AsParallel extension method.
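To make the loop-to-query conversion concrete, here is a minimal sketch that writes the same filter-and-project operation three ways: as a For Each loop, as a LINQ to Objects query, and as a parallel query. The module and function names are made up for this illustration and are not part of the LINQ script project.

```vb
Imports System
Imports System.Collections.Generic
Imports System.Linq

Module LoopVersusQuery

    ' Loop form: collect the squares of the even numbers.
    Public Function SquaresOfEvensLoop(numbers As IEnumerable(Of Integer)) As List(Of Integer)
        Dim result As New List(Of Integer)
        For Each n As Integer In numbers
            If n Mod 2 = 0 Then
                result.Add(n * n)
            End If
        Next
        Return result
    End Function

    ' The same operation written as a LINQ to Objects query.
    Public Function SquaresOfEvensQuery(numbers As IEnumerable(Of Integer)) As List(Of Integer)
        Dim query = From n In numbers
                    Where n Mod 2 = 0
                    Select n * n
        Return query.ToList()
    End Function

    ' Parallel variant: AsParallel() hands the work to PLINQ,
    ' and AsOrdered() keeps the source order in the result.
    Public Function SquaresOfEvensParallel(numbers As IEnumerable(Of Integer)) As List(Of Integer)
        Dim query = From n In numbers.AsParallel().AsOrdered()
                    Where n Mod 2 = 0
                    Select n * n
        Return query.ToList()
    End Function

    Sub Main()
        Dim data = Enumerable.Range(1, 10)
        Console.WriteLine(String.Join(", ", SquaresOfEvensLoop(data)))
        Console.WriteLine(String.Join(", ", SquaresOfEvensQuery(data)))
        Console.WriteLine(String.Join(", ", SquaresOfEvensParallel(data)))
    End Sub

End Module
```

For a collection this small the parallel version is unlikely to be faster; it only shows that the query text barely changes.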

Background knowledge: the CodeDOM compiler

The CodeDOM provides types that represent many common types of source code elements. You can design a program that builds a source code model using CodeDOM elements to assemble an object graph. This object graph can be rendered as source code using a CodeDOM code generator for a supported programming language. The CodeDOM can also be used to compile source code into a binary assembly.

From my point of view, CodeDOM mainly contains two kinds of objects:

  1. Classes that represent the logical structure of source code, independent of language syntax, in the System.CodeDom namespace. You can build a source code object model for your own programming language using the classes in this namespace!
  2. Classes that compile the source code object model into a binary assembly file, in the System.CodeDom.Compiler namespace. The classes in this namespace are really just a command-line interop tool for the .NET compilers vbc.exe and csc.exe.

So using the CodeDOM classes, you can build a compiler for your own language that compiles it into a .NET program. The difficult job is writing a parser that correctly turns source code in your language into language-independent CodeDOM elements. After that, compiling a binary assembly with CodeDOM is easy.

The compile procedure of a .NET program using a user-defined compiler:

Picture 1. Workflow of a custom language compiler using CodeDOM

Many .NET programmers may have used the offline code conversion feature in SharpDevelop. That feature is implemented in the same way as CodeDOM works:

Source code (in VisualBasic) -> parsed into a CodeDOM object model -> code in another language (Boo, C#, Python, Ruby)

Picture 2. The code conversion feature in SharpDevelop

Using the CodeDom

Picture 3. What does a .NET program consist of?

Usually a .NET program consists mainly of several class definitions, because every .NET language is a fully object-oriented programming language. Once you understand what your .NET program consists of, you can start to model your source code using CodeDOM. Modelling a .NET program with CodeDOM is easy, because the steps follow the same order in which you would write the source code:

1. Define an assembly module

VB
Dim Assembly As CodeDom.CodeCompileUnit = New CodeDom.CodeCompileUnit

A code compile unit stands for your application's assembly file. You can use this object to compile an EXE program or a DLL library module.

2. Declare namespace

VB
Dim CodeNameSpace As CodeDom.CodeNamespace = New CodeDom.CodeNamespace("NamespaceName")
VB
Call Assembly.Namespaces.Add(CodeNameSpace)

You always need a namespace to contain the class object.

3. Add a class to your program

VB
Dim DeclareClassType As CodeDom.CodeTypeDeclaration = New CodeDom.CodeTypeDeclaration(ClassName)

This statement defines a class with the specific name ClassName. Then add the new class to the namespace that you declared previously:

VB
CodeNameSpace.Types.Add(DeclareClassType)

4. Add methods and properties to the class

Finally, you can add properties, fields and methods to the class type defined in step 3 to implement a specific function:

VB
Dim CodeMemberMethod As CodeDom.CodeMemberMethod = New CodeDom.CodeMemberMethod()
CodeMemberMethod.Name = Name
CodeMemberMethod.ReturnType = New CodeDom.CodeTypeReference(Type)
' Declare the local variable rval that holds the return value
CodeMemberMethod.Statements.Add(New CodeDom.CodeVariableDeclarationStatement(Type, "rval"))
' Add the statements that make up the function body
CodeMemberMethod.Statements.AddRange(StatementsCollectionInTheFunction)
' Return the value of the local variable rval
CodeMemberMethod.Statements.Add(New CodeDom.CodeMethodReturnStatement(New CodeDom.CodeVariableReferenceExpression("rval")))

At last, after you add the declared function to your declared class, you almost have a complete .NET program.

VB
Call DeclareClassType.Members.Add(CodeMemberMethod)

You can use this function to get the auto-generated source code from the CodeDOM object model. It is easy, have fun:

VB
''' <summary>
''' Generate the source code from the CodeDOM object model, which makes the program easier to debug.
''' </summary>
''' <param name="NameSpace">The CodeDOM namespace to render</param>
''' <param name="CodeStyle">VisualBasic, C#</param>
''' <returns>The generated source code</returns>
''' <remarks>
''' You can easily convert the source code between VisualBasic and C# with this function, just by changing the language name in the statement:
''' CodeDomProvider.GetCompilerInfo("VisualBasic").CreateProvider().GenerateCodeFromNamespace([NameSpace], sWriter, Options)
''' Change VisualBasic into C#.
''' </remarks>
Public Shared Function GenerateCode([NameSpace] As CodeDom.CodeNamespace, Optional CodeStyle As String = "VisualBasic") As String
     Dim sBuilder As StringBuilder = New StringBuilder()


     Using sWriter As IO.StringWriter = New System.IO.StringWriter(sBuilder)
         Dim Options As New CodeGeneratorOptions() With {
             .IndentString = "  ", .ElseOnClosing = True, .BlankLinesBetweenMembers = True}
         CodeDomProvider.GetCompilerInfo(CodeStyle).CreateProvider().GenerateCodeFromNamespace([NameSpace], sWriter, Options)
         Return sBuilder.ToString()
     End Using

End Function
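As a compact illustration of steps 1 to 4 together with code generation, the following sketch builds a tiny object model and renders it in both VisualBasic and C#. It assumes the .NET Framework (or another runtime where System.CodeDom is available); the names Demo, Greeter, Greet, BuildNamespace and Render are invented for this example.

```vb
Imports System
Imports System.CodeDom
Imports System.CodeDom.Compiler
Imports System.IO

Module CodeDomSketch

    ' Build the object model: namespace Demo, class Greeter, method Greet.
    Public Function BuildNamespace() As CodeNamespace
        Dim ns As New CodeNamespace("Demo")
        Dim cls As New CodeTypeDeclaration("Greeter") With {.IsClass = True}
        Dim method As New CodeMemberMethod() With {
            .Name = "Greet",
            .Attributes = MemberAttributes.Public,
            .ReturnType = New CodeTypeReference(GetType(String))}

        ' Body: declare rval with an initial value, then return it.
        method.Statements.Add(New CodeVariableDeclarationStatement(
            GetType(String), "rval", New CodePrimitiveExpression("hello")))
        method.Statements.Add(New CodeMethodReturnStatement(
            New CodeVariableReferenceExpression("rval")))

        cls.Members.Add(method)
        ns.Types.Add(cls)
        Return ns
    End Function

    ' Render the same model as source code in the requested language.
    Public Function Render(ns As CodeNamespace, language As String) As String
        Using provider As CodeDomProvider = CodeDomProvider.CreateProvider(language)
            Using writer As New StringWriter()
                provider.GenerateCodeFromNamespace(ns, writer, New CodeGeneratorOptions())
                Return writer.ToString()
            End Using
        End Using
    End Function

    Sub Main()
        Dim ns = BuildNamespace()
        Console.WriteLine(Render(ns, "VisualBasic"))
        Console.WriteLine(Render(ns, "CSharp"))
    End Sub

End Module
```

Printing the generated source like this is a convenient way to check a hand-built object model before trying to compile it.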

From my tests, the CodeDOM compiler currently supports only the VisualBasic and C# languages. F# is not supported yet: when I tried F# or FSharp as the language name, the CodeDOM compiler threw a CodeDomProvider not found exception.
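Rather than discovering a missing provider through an exception, a program can ask CodeDOM which language names are registered on the current machine. The sketch below is one way to do that; the module and function names are illustrative, and the result for any given name depends on the installed framework and its configuration, so no particular output is claimed here.

```vb
Imports System
Imports System.CodeDom.Compiler

Module ProviderCheck

    ' True when a CodeDOM provider is registered under this language name.
    Public Function IsSupported(language As String) As Boolean
        Return CodeDomProvider.IsDefinedLanguage(language)
    End Function

    Sub Main()
        ' List every registered provider with all of its language aliases.
        For Each info As CompilerInfo In CodeDomProvider.GetAllCompilerInfo()
            Console.WriteLine(String.Join(", ", info.GetLanguages()))
        Next

        ' Probe a few names before calling CreateProvider on them.
        For Each name As String In {"VisualBasic", "C#", "F#"}
            Console.WriteLine("{0}: {1}", name, IsSupported(name))
        Next
    End Sub

End Module
```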

And use this function to compile the CodeDOM object model into a binary assembly file (EXE/DLL):

VB
''' <summary>
''' Compile the CodeDOM object model into a binary assembly module file.
''' </summary>
''' <param name="ObjectModel">CodeDOM dynamic code object model.</param>
''' <param name="Reference">Collection of reference assembly file paths (the DLL files that the user code references).</param>
''' <param name="DotNETReferenceAssembliesDir">Directory of the .NET Framework SDK reference assemblies</param>
''' <param name="CodeStyle">VisualBasic, C#</param>
''' <returns>The compiled assembly</returns>
''' <remarks></remarks>
Public Shared Function Compile(ObjectModel As CodeDom.CodeCompileUnit, Reference As String(), DotNETReferenceAssembliesDir As String, Optional CodeStyle As String = "VisualBasic") As System.Reflection.Assembly
     Dim CodeDomProvider As CodeDom.Compiler.CodeDomProvider = 
         CodeDom.Compiler.CodeDomProvider.CreateProvider(CodeStyle)
     Dim Options As CodeDom.Compiler.CompilerParameters = 
         New CodeDom.Compiler.CompilerParameters


     Options.GenerateInMemory = True
     Options.IncludeDebugInformation = False
     Options.GenerateExecutable = False


     If Not Reference.IsNullOrEmpty Then
         Call Options.ReferencedAssemblies.AddRange(Reference)
     End If


     Call Options.ReferencedAssemblies.AddRange(New String() {
          DotNETReferenceAssembliesDir & "\System.dll",
          DotNETReferenceAssembliesDir & "\System.Core.dll",
          DotNETReferenceAssembliesDir & "\System.Data.dll",
          DotNETReferenceAssembliesDir & "\System.Data.DataSetExtensions.dll",
          DotNETReferenceAssembliesDir & "\System.Xml.dll",
          DotNETReferenceAssembliesDir & "\System.Xml.Linq.dll"})


      Dim Compiled = CodeDomProvider.CompileAssemblyFromDom(Options, ObjectModel)
      Return Compiled.CompiledAssembly

End Function

The parameter named DotNETReferenceAssembliesDir is the directory that holds the .NET Framework reference assembly files. In testing, we found that the location of the .NET Framework reference assemblies differs between Win7 and Win8, so I use this parameter to make the CodeDOM compiler work correctly. If you get a compile error saying that a directory or reference is wrong, check this parameter first.
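When a reference path is wrong, the compiler messages are usually more helpful than the exception raised later by reading CompiledAssembly. The following sketch, which is not the article's Compile function, checks CompilerResults.Errors and reports the messages. It assumes the .NET Framework, where CodeDOM compilation is supported, and it assumes that the directory returned by RuntimeEnvironment.GetRuntimeDirectory() contains System.dll and System.Core.dll; CompileOrThrow and the Demo.Placeholder type are names invented for the example.

```vb
Imports System
Imports System.CodeDom
Imports System.CodeDom.Compiler
Imports System.IO
Imports System.Reflection
Imports System.Runtime.InteropServices
Imports System.Text

Module CompileWithDiagnostics

    ' Compile a CodeDOM unit in memory; on failure, throw with the compiler messages.
    Public Function CompileOrThrow(unit As CodeCompileUnit,
                                   Optional language As String = "VisualBasic") As Assembly
        ' Assumption: the running runtime's directory holds the framework DLLs.
        Dim frameworkDir As String = RuntimeEnvironment.GetRuntimeDirectory()
        Dim options As New CompilerParameters() With {
            .GenerateInMemory = True,
            .GenerateExecutable = False}
        options.ReferencedAssemblies.Add(Path.Combine(frameworkDir, "System.dll"))
        options.ReferencedAssemblies.Add(Path.Combine(frameworkDir, "System.Core.dll"))

        Using provider As CodeDomProvider = CodeDomProvider.CreateProvider(language)
            Dim results As CompilerResults = provider.CompileAssemblyFromDom(options, unit)
            If results.Errors.HasErrors Then
                Dim message As New StringBuilder("Compilation failed:")
                For Each compilerError As CompilerError In results.Errors
                    message.AppendLine().Append(compilerError.ToString())
                Next
                Throw New InvalidOperationException(message.ToString())
            End If
            Return results.CompiledAssembly
        End Using
    End Function

    Sub Main()
        ' A minimal unit: one namespace holding one empty class.
        Dim unit As New CodeCompileUnit()
        Dim ns As New CodeNamespace("Demo")
        ns.Types.Add(New CodeTypeDeclaration("Placeholder"))
        unit.Namespaces.Add(ns)

        Dim compiled As Assembly = CompileOrThrow(unit)
        Console.WriteLine(compiled.GetType("Demo.Placeholder") IsNot Nothing)
    End Sub

End Module
```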

LINQ Script: Universal Object Query Framework

Picture 4. LINQ Framework schema

LINQ framework workflow overview

How does this LINQ script module work? It works the same way as the compile procedure of a .NET program. There are 3 namespaces in the LINQFramework project that implement this script feature:

Framework: This namespace contains a dynamic code compiler that compiles the LINQ statement into an assembly from its object model (which is defined in the Statements namespace), a LINQ framework interface (the ILINQ interface and LQueryFramework) that acts as an interop service for the query entities, and a TypeRegistry for object type recognition.

Parser: The statement parser. The parser class in this namespace parses the tokens of the LINQ statement into a set of CodeDOM code statement object models. Great thanks go to the admirable coding work in this article:

http://www.codeproject.com/Articles/14383/An-Expression-Parser-for-CodeDom

Statement: The object model of a LINQ statement is defined here.

Picture 5. Source code organization

How to dynamically compile a LINQ statement into an assembly?

The LINQ script is based on dynamic code compilation with CodeDOM. Here is an overview of the workflow of this LINQ Framework:

First, the LINQ query script entered by the user is parsed by the LINQ statement object, and then each token expression is parsed into a CodeDOM code object model using the Parser class.

Then the framework compiles this LINQ statement object model into an assembly module, dynamically loads the elements in the compiled module, and finishes the initialization of the LINQ framework.

At last, the LINQ framework performs the query operation using a specific LINQ entity.

Detail of the code implementation

Construct a LINQ statement object model

First, we look at the class types in the Statements namespace of my uploaded project. There are some token elements and a statement class in the namespace.

Picture 6. LINQ Statements namespace

Each class in the Statements.Tokens namespace corresponds to one statement token in the LINQ script:

Class Object | LINQ Statement Token | Information
ObjectDeclaration | From object As Type | You should specify the object type in this token of the LINQ script so that the correct LINQ entity module is loaded.
ObjectCollection | In <Collection Expression> | The collection expression is a file path or a database connection string.
ReadOnlyObject | Let object = <expression> | “Read-only” means that we have no chance to modify its value.
WhereCondition | Where <Boolean Expression> | Tests whether the condition is True to decide whether the Select token is executed in this iteration.
SelectConstruct | Select <Expression> | Gets an object for the returned result collection.
Options | | This feature is not implemented yet.
Token | | The base type of all the classes in this namespace.
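To give a feel for how a script breaks down into the tokens in the table, here is a deliberately naive sketch that splits a LINQ script at its clause keywords with a regular expression. It is not the parser used in the project; SplitClauses and the module name are invented, and the approach has a known limitation noted in the comments.

```vb
Imports System
Imports System.Collections.Generic
Imports System.Text.RegularExpressions

Module ClauseSplitter

    ' Clause keywords, matched as whole words and case-insensitively.
    Private ReadOnly ClauseKeyword As New Regex(
        "\b(from|in|let|where|select)\b", RegexOptions.IgnoreCase)

    ' Returns (keyword, clause text) pairs in script order.
    ' Limitation: a keyword inside a string literal is also treated as a clause start.
    Public Function SplitClauses(script As String) As List(Of KeyValuePair(Of String, String))
        Dim clauses As New List(Of KeyValuePair(Of String, String))
        Dim matches As MatchCollection = ClauseKeyword.Matches(script)
        For i As Integer = 0 To matches.Count - 1
            ' The clause body runs from the end of this keyword to the next keyword.
            Dim bodyStart As Integer = matches(i).Index + matches(i).Length
            Dim bodyEnd As Integer = If(i + 1 < matches.Count, matches(i + 1).Index, script.Length)
            clauses.Add(New KeyValuePair(Of String, String)(
                matches(i).Value.ToLowerInvariant(),
                script.Substring(bodyStart, bodyEnd - bodyStart).Trim()))
        Next
        Return clauses
    End Function

    Sub Main()
        Dim script = "from fasta as fasta in ""xcc8004.fsa"" " &
                     "let seq = fasta.sequence " &
                     "where seq.length < 500 " &
                     "select seq"
        For Each clause In SplitClauses(script)
            Console.WriteLine("{0,-6} | {1}", clause.Key, clause.Value)
        Next
    End Sub

End Module
```

A real parser has to tokenize string literals first, which is one reason the project delegates expression parsing to a dedicated parser class.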

Then we can use these token elements to construct a LINQ statement object model. Here is part of the class definition of the LINQ statement object model:

VB
Namespace Statements

''' <summary>
''' A linq statement object model.
''' </summary>
''' <remarks>
''' From [Object [As TypeId]]
''' In [Collection]
''' Let [Declaration1, Declaration2, ...]
''' Where [Condition Test]
''' Select [Object/Object Constrctor]
''' [Distinct]
''' [Order Statement]</remarks>
Public Class Statement : Inherits LINQ.Statements.Tokens.Token


    ''' <summary>
    ''' An object element in the target query collection.
    ''' </summary>
    Public Property [Object] As LINQ.Statements.Tokens.ObjectDeclaration
    ''' <summary>
    ''' Where test condition for the query.
    ''' </summary>
    Public Property ConditionTest As LINQ.Statements.Tokens.WhereCondition
    ''' <summary>
    ''' Target query collection expression; this can be a file path or a database connection string.
    ''' </summary>
    Public Property Collection As LINQ.Statements.Tokens.ObjectCollection
    ''' <summary>
    ''' The read-only objects that are constructed by the Let tokens in the LINQ statement.
    ''' </summary>
    Public Property ReadOnlyObjects As LINQ.Statements.Tokens.ReadOnlyObject()
    ''' <summary>
    ''' An expression that produces the returned query result.
    ''' </summary>
    Public Property SelectConstruct As LINQ.Statements.Tokens.SelectConstruct

    Friend _Tokens As String()
    Friend TypeRegistry As LINQ.Framework.TypeRegistry
    ''' <summary>
    ''' The temporary module type compiled from this LINQ script object.
    ''' </summary>
    Friend ILINQProgram As System.Type

TypeRegistry component

The TypeRegistry class is required for external module loading. After the ObjectDeclaration object has been parsed, its constructor queries the type registry to get the type information of the LINQ entity in the external module. From the LINQ entity type information, the element object type information is then obtained. At last, with the object type information known, the declaration is parsed completely and compiled into the dynamic assembly module.

Picture 7. External module dynamic loading procedure using the TypeRegistry component

RegistryItem definition

A registry item records the type information of each LINQ entity in the external module:

VB
''' <summary>
''' Item in the type registry table
''' </summary>
Public Class RegistryItem

     ''' <summary>
     ''' Short name or alias of the type, that is, the constructor argument of the LINQEntity custom attribute.
     ''' </summary>
     <Xml.Serialization.XmlAttribute> Public Property Name As String
     ''' <summary>
     ''' A relative path is recommended, so that the module does not have to be registered again after the program is moved.
     ''' </summary>
     <Xml.Serialization.XmlAttribute> Public Property AssemblyPath As String
     ''' <summary>
     ''' Full type name of the target LINQ entity type.
     ''' </summary>
     <Xml.Serialization.XmlAttribute> Public Property TypeId As String

External module register

Before you run a LINQ script to query a database, you should register the LINQ entity type information in the LINQ framework’s type registry, so that the LINQ framework is able to load the LINQ entity from the external assembly module correctly. Here is the function that performs the external module registration:

VB
''' <summary>
''' Register the external LINQ entity assembly module in the LINQFramework
''' </summary>
''' <param name="AssemblyPath">DLL file path</param>
''' <returns></returns>
''' <remarks>Finds the type definitions of the target elements and collects their information</remarks>
Public Function Register(AssemblyPath As String) As Boolean
    Dim Assembly As System.Reflection.Assembly = System.Reflection.Assembly.LoadFile(IO.Path.GetFullPath(AssemblyPath)) 'Load external module
    Dim ILINQEntityTypes As System.Reflection.TypeInfo() =
        LINQ.Framework.LQueryFramework.LoadAssembly(Assembly, Reflection.LINQEntity.ILINQEntity) 'Get type define informations of LINQ entity

    If ILINQEntityTypes.Count > 0 Then
        Dim LQuery As Generic.IEnumerable(Of TypeRegistry.RegistryItem) =
            From Type As System.Type In ILINQEntityTypes
            Select New TypeRegistry.RegistryItem With {
                .Name = Framework.Reflection.LINQEntity.GetEntityType(Type),
                .AssemblyPath = AssemblyPath,
                .TypeId = Type.FullName}        'Generate the registry item for each external type

        For Each Item In LQuery.ToArray         'Update the existing registry item or insert a new item into the table
            Dim Item2 As RegistryItem = Find(Item.Name)         'Check whether the type is already registered
            If Item2 Is Nothing Then
                Call ExternalModules.Add(Item)  'Insert new record
            Else                                'Update existing data
                Item2.AssemblyPath = Item.AssemblyPath
                Item2.TypeId = Item.TypeId
            End If
        Next
        Return True
    'I didn't find any LINQ entity type definition; skip this DLL assembly file
    Else
        Return False
    End If

End Function

Type Finding

Here I use a function named Find to look up a registered external LINQ entity type:

VB
''' <summary>
''' Return the registry item in the table that has the specified name.
''' </summary>
''' <param name="Name"></param>
''' <returns></returns>
''' <remarks></remarks>
Public Function Find(Name As String) As TypeRegistry.RegistryItem
     For i As Integer = 0 To ExternalModules.Count - 1
         If String.Equals(Name, ExternalModules(i).Name, StringComparison.OrdinalIgnoreCase) Then
             Return ExternalModules(i)
         End If
     Next
     Return Nothing
End Function
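The insert-or-update logic of Register and the case-insensitive lookup of Find can also be expressed with a dictionary keyed by name. The sketch below is an alternative illustration, not the project's implementation; Entry is a hypothetical stand-in for RegistryItem, and the sample path is a placeholder.

```vb
Imports System
Imports System.Collections.Generic

Module RegistrySketch

    ' Hypothetical stand-in for RegistryItem.
    Public Class Entry
        Public Property Name As String
        Public Property AssemblyPath As String
        Public Property TypeId As String
    End Class

    ' Names are compared case-insensitively, as in the Find function.
    Private ReadOnly Table As New Dictionary(Of String, Entry)(StringComparer.OrdinalIgnoreCase)

    ' Insert a new entry or overwrite the one registered under the same name.
    Public Sub Register(item As Entry)
        Table(item.Name) = item
    End Sub

    ' Return the entry for a name, or Nothing when it is not registered.
    Public Function Find(name As String) As Entry
        Dim found As Entry = Nothing
        Table.TryGetValue(name, found)
        Return found
    End Function

    Sub Main()
        Register(New Entry With {
            .Name = "fasta",
            .AssemblyPath = "TestEntity.dll",
            .TypeId = "TestEntity.FASTA"})
        Console.WriteLine(Find("FASTA").TypeId)
        Console.WriteLine(Find("genbank") Is Nothing)
    End Sub

End Module
```

A dictionary gives constant-time lookups, while the list used in the article keeps the items in a form that is straightforward to serialize to XML.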

Dynamic Compile

Statement token | CodeDOM element | Compiled result
ObjectDeclaration | CodeMemberField, CodeMemberMethod | A field in the class ILINQProgram and a function named SetObject
ObjectCollection | - | An external location exposed through the ILINQCollection interface; not included in the dynamically compiled code
ReadOnlyObject | CodeMemberField, CodeAssignStatement | A field in the class ILINQProgram and a value assignment statement in the function SetObject
WhereCondition | CodeMemberMethod | The Test function in the ILINQProgram class
SelectConstruct | CodeMemberMethod | The SelectMethod function in the ILINQProgram class
Options | | Not implemented yet

So if we convert the LINQ statement into a class definition, it may look like this:

Original LINQ query script:

VB
Dim LQuery As String = "from fasta as fasta in ""/home/xieguigang/BLAST/db/xcc8004.fsa"" " &
                       "let seq = fasta.sequence " &
                       "where regex.match(seq,""A{5}T{1}"").success & seq.length < 500 " &
                       "select system.string.format(""{0}{1}{2}{3}"", fasta.tostring, microsoft.visualbasic.vbcrlf, seq, microsoft.visualbasic.vbcrlf)"

A VisualBasic version of the compiled code for this LINQ statement:

VB
Namespace LINQDynamicCodeCompiled
  Public Class ILINQProgram
    ' [from fasta as fasta] variable declaration token in the LINQ statement
    Public fasta As TestEntity.FASTA
    ' [let seq = fasta.sequence] read-only object declaration in the LINQ statement
    Public seq As Object
    ' [Where <condition test>] condition test token in the LINQ statement
    Public Overridable Function Test() As Boolean
      Dim rval As Boolean
      rval = (regex.match(seq, "A{5}T{1}").success  _
            And (seq.length < 500))
      Return rval
    End Function
    ' Called on each loop iteration: takes one object from the target collection and
    ' uses it to initialize every read-only object declared by a "let" token.
    Public Overridable Function SetObject(ByVal p As TestEntity.FASTA) As Boolean
      Dim rval As Boolean
      Me.fasta = p
      seq = fasta.sequence
      Return rval
    End Function
    ' [Select <expression>] the select token produces the values of the result collection of this LINQ statement.
    Public Overridable Function SelectMethod() As Object
      Dim rval As Object
      rval = system.string.format("{0}{1}{2}{3}", fasta.tostring, microsoft.visualbasic.vbcrlf, seq, microsoft.visualbasic.vbcrlf)
      Return rval
    End Function
  End Class
End Namespace

Deal with the situation without where condition expression

Sometimes the LINQ statement has no “where condition test”, because we just want to batch-convert objects using the select method. How do we deal with this situation? I create an empty Test function and make it always return True. If parsing the where condition test expression yields a null CodeExpression, the StatementCollection object in the Compile function below stays empty, and the DeclareFunction function initializes the rval variable to True, so we will not need to modify the LINQ object model structure in the future.

VB
Public Function Compile() As CodeDom.CodeTypeMember
     Dim StatementCollection As CodeDom.CodeStatementCollection = Nothing


     If Not Statement.ConditionTest.Expression Is Nothing Then
         StatementCollection = New CodeDom.CodeStatementCollection
         StatementCollection.Add(New CodeDom.CodeAssignStatement(
             New CodeDom.CodeVariableReferenceExpression("rval"), 
             Statement.ConditionTest.Expression))
     End If
     Dim [Function] As CodeDom.CodeMemberMethod = 
        DynamicCode.VBC.DynamicCompiler.DeclareFunction(FunctionName, 
            "System.Boolean", StatementCollection)
     [Function].Attributes = CodeDom.MemberAttributes.Public
     Return [Function]
End Function


''' <summary>
''' Declare a function with a specific function name and return type. Please notice that this newly
''' declared function always contains a local variable named rval, which is used to return the value.
''' </summary>
''' <param name="Name">Function name.</param>
''' <param name="Type">Function return value type.</param>
''' <returns>A CodeDOM object model of the target function.</returns>
''' <remarks></remarks>
Public Shared Function DeclareFunction(Name As String, Type As String, Statements As CodeDom.CodeStatementCollection) As CodeDom.CodeMemberMethod
     Dim CodeMemberMethod As CodeDom.CodeMemberMethod = New CodeDom.CodeMemberMethod()
     'Create a parameterless function with the given name (for example "WhereTest") and return type

     CodeMemberMethod.Name = Name : CodeMemberMethod.ReturnType = 
        New CodeDom.CodeTypeReference(Type)  
     If String.Equals(Type, "System.Boolean", StringComparison.OrdinalIgnoreCase) Then
         CodeMemberMethod.Statements.Add(
            New CodeDom.CodeVariableDeclarationStatement(Type, "rval", 
            New CodeDom.CodePrimitiveExpression(True))) 'Local variable for the return value; a Boolean defaults to True
      Else
         CodeMemberMethod.Statements.Add(
            New CodeDom.CodeVariableDeclarationStatement(Type, "rval")) 'Local variable for the return value
      End If

      If Not (Statements Is Nothing OrElse Statements.Count = 0) Then
          CodeMemberMethod.Statements.AddRange(Statements)
      End If
      CodeMemberMethod.Statements.Add(
        New CodeDom.CodeMethodReturnStatement(
        New CodeDom.CodeVariableReferenceExpression("rval")))  'Return the local variable that holds the return value

      Return CodeMemberMethod

End Function
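The effect of the default True value can be seen by generating the Test function with and without a condition and reading the resulting source. This is a simplified, self-contained sketch of the idea rather than the project's code; BuildTest and Render are names invented here, and the condition 1 < 2 merely stands in for a parsed Where expression.

```vb
Imports System
Imports System.CodeDom
Imports System.CodeDom.Compiler
Imports System.IO

Module WhereTestSketch

    ' Build "Function Test() As Boolean" whose local rval starts as True.
    ' When condition is Nothing, the body is only the declaration and the Return.
    Public Function BuildTest(condition As CodeExpression) As CodeMemberMethod
        Dim method As New CodeMemberMethod() With {
            .Name = "Test",
            .Attributes = MemberAttributes.Public,
            .ReturnType = New CodeTypeReference(GetType(Boolean))}
        method.Statements.Add(New CodeVariableDeclarationStatement(
            GetType(Boolean), "rval", New CodePrimitiveExpression(True)))
        If condition IsNot Nothing Then
            method.Statements.Add(New CodeAssignStatement(
                New CodeVariableReferenceExpression("rval"), condition))
        End If
        method.Statements.Add(New CodeMethodReturnStatement(
            New CodeVariableReferenceExpression("rval")))
        Return method
    End Function

    ' Wrap the method in a class and render it as Visual Basic source.
    Public Function Render(method As CodeMemberMethod) As String
        Dim host As New CodeTypeDeclaration("ILINQProgram")
        host.Members.Add(method)
        Using provider As CodeDomProvider = CodeDomProvider.CreateProvider("VisualBasic")
            Using writer As New StringWriter()
                provider.GenerateCodeFromType(host, writer, New CodeGeneratorOptions())
                Return writer.ToString()
            End Using
        End Using
    End Function

    Sub Main()
        ' No Where clause: Test() returns the initial value True.
        Console.WriteLine(Render(BuildTest(Nothing)))

        ' A Where clause, here the stand-in condition 1 < 2.
        Dim condition As New CodeBinaryOperatorExpression(
            New CodePrimitiveExpression(1),
            CodeBinaryOperatorType.LessThan,
            New CodePrimitiveExpression(2))
        Console.WriteLine(Render(BuildTest(condition)))
    End Sub

End Module
```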

Create a LINQ Entity for your User

Every LINQ entity should implement the interface LINQ.Framework.ILINQCollection. Here is the ILINQCollection interface definition:

VB
Namespace Framework

    ''' <summary>
    ''' LINQ Entity
    ''' </summary>
    ''' <remarks></remarks>
    Public Interface ILINQCollection

        ''' <summary>
        ''' Get a collection of the target LINQ entity.
        ''' </summary>
        ''' <param name="Statement"></param>
        ''' <returns></returns>
        ''' <remarks></remarks>
        Function GetCollection(Statement As LINQ.Statements.LINQStatement) As Object()

        ''' <summary>
        ''' Get the type information of the element object in the LINQ entity collection.
        ''' </summary>
        ''' <returns></returns>
        ''' <remarks></remarks>
        Function GetEntityType() As System.Type
    End Interface
End Namespace

Why do we need to do this?

To get the data source collection and to declare the data type of the elements in the LINQ entity collection. Through the GetCollection function of the interface, we are able to load data from the target collection expression in the LINQ query script, and through the GetEntityType function we are able to declare an object of the right type in the dynamically compiled LINQ query class.

How to?

First, just create an empty class and implement this interface; Visual Studio will automatically add the empty methods to your class. Let's see a very simple example:

VB
<LINQ.Framework.Reflection.LINQEntity("member")>
<Xml.Serialization.XmlType("doc")> Public Class ExampleXMLCollection
    Implements LINQ.Framework.ILINQCollection

    Public Property members As List(Of member)

    Public Function GetCollection(Statement As LINQ.Statements.LINQStatement) As Object() Implements LINQ.Framework.ILINQCollection.GetCollection
        Dim xml = Statement.Collection.Value.ToString.LoadXml(Of ExampleXMLCollection)()
        Me.members = xml.members
        Return members.ToArray
    End Function

    Public Function GetEntityType() As Type Implements LINQ.Framework.ILINQCollection.GetEntityType
        Return GetType(member)
    End Function

End Class

In the simple example shown above, the GetCollection function defines how the object collection is loaded, which is the loading procedure of the object-oriented database. From the GetEntityType function, the LINQFramework knows that the objects in the collection are of type member, so it can create a query instance correctly.
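The same two-method pattern applies to a text database such as a FASTA file. The sketch below is self-contained, so it defines its own interface: ICollectionSource is a hypothetical stand-in for LINQ.Framework.ILINQCollection (the real GetCollection receives a LINQStatement rather than a file path), and the Fasta and FastaCollection classes are likewise invented for the illustration.

```vb
Imports System
Imports System.Collections.Generic
Imports System.IO

Module FastaEntitySketch

    ' Hypothetical stand-in for LINQ.Framework.ILINQCollection.
    Public Interface ICollectionSource
        Function GetCollection(filePath As String) As Object()
        Function GetEntityType() As Type
    End Interface

    ' One FASTA record: a title line and its sequence.
    Public Class Fasta
        Public Property Title As String
        Public Property Sequence As String
    End Class

    Public Class FastaCollection
        Implements ICollectionSource

        ' Load every record of the file: a ">" line starts a record,
        ' and the following lines are appended to its sequence.
        Public Function GetCollection(filePath As String) As Object() Implements ICollectionSource.GetCollection
            Dim records As New List(Of Object)
            Dim current As Fasta = Nothing
            For Each line As String In File.ReadAllLines(filePath)
                If line.StartsWith(">") Then
                    current = New Fasta With {.Title = line.Substring(1).Trim(), .Sequence = ""}
                    records.Add(current)
                ElseIf current IsNot Nothing Then
                    current.Sequence &= line.Trim()
                End If
            Next
            Return records.ToArray()
        End Function

        ' Tell the caller which element type the collection contains.
        Public Function GetEntityType() As Type Implements ICollectionSource.GetEntityType
            Return GetType(Fasta)
        End Function
    End Class

    Sub Main()
        Dim tempFile As String = Path.GetTempFileName()
        File.WriteAllLines(tempFile, {">seq1 demo", "AAAAAT", "GGCC", ">seq2", "TTTT"})

        Dim source As New FastaCollection()
        For Each item As Object In source.GetCollection(tempFile)
            Dim record As Fasta = DirectCast(item, Fasta)
            Console.WriteLine("{0}: {1}", record.Title, record.Sequence)
        Next
        File.Delete(tempFile)
    End Sub

End Module
```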

User Query O-O Database Using LINQ Framework

Query steps:

  1. Load the LINQ entity type from the external compiled assembly module; the type information to load comes from the ObjectDeclaration object in the Statements.Tokens namespace.
  2. Create an instance of the loaded LINQ entity type using the Activator.CreateInstance function. We are then able to load the object collection through this LINQ entity.
  3. Initialize every statement token in the LINQ statement object model: we use reflection to get the method information from the loaded LINQ entity type, and from it we create the lambda expressions for the query operation.
  4. We now have all the elements needed to build a LINQ query in code, and we use that LINQ statement to execute the object-oriented database query.
  5. Return the query result to the user.

Finally, we are able to query an object-oriented database using the LINQ script through this query function:

VB
''' <summary>
''' Execute a compiled LINQ statement object model to query an object-oriented database.
''' </summary>
''' <param name="Statement"></param>
''' <returns></returns>
''' <remarks>
''' Dim List As List(Of Object) = New List(Of Object)
''' 
''' For Each [Object] In LINQ.GetCollection(Statement)
'''    Call SetObject([Object])
'''    If True = Test() Then
'''        List.Add(SelectConstruct())
'''    End If
''' Next
''' Return List.ToArray
''' </remarks>
Public Function EXEC(Statement As LINQ.Statements.LINQStatement) As Object()
    'Create an instance of the LINQ entity and initialize the components
    Dim StatementInstance = Statement.CreateInstance
    Dim LINQ As ILINQCollection = Statement.Collection.ILINQCollection
    'Construct the Lambda expression 
    Dim Test As System.Func(Of Boolean) = 
        Function() Statement.ConditionTest.TestMethod.Invoke(StatementInstance, Nothing)  
    Dim SetObject As System.Func(Of Object, Boolean) = 
        Function(p As Object) Statement.Object.SetObject.Invoke(StatementInstance, {p})
    Dim SelectConstruct As System.Func(Of Object) = 
        Function() Statement.SelectConstruct.SelectMethod.Invoke(StatementInstance, Nothing)


    Dim LQuery = From [Object] As Object In LINQ.GetCollection(Statement)
                 Let f = SetObject([Object])
                 Where True = Test()
                 Let t = SelectConstruct()
                 Select t 'Build a LINQ query object model using the constructed elements
    Return LQuery.ToArray 'return the query result

End Function
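The reflection part of this function can be tried in isolation. In the sketch below, a hand-written class named CompiledProgram stands in for the dynamically compiled ILINQProgram class, and Execute drives its three methods through MethodInfo.Invoke in the order given in the remarks of EXEC. All names here are illustrative, and a plain loop is used instead of a LINQ query so that the call order per element is explicit.

```vb
Imports System
Imports System.Collections.Generic
Imports System.Reflection

Module ReflectionQuerySketch

    ' Hand-written stand-in for the dynamically compiled ILINQProgram class.
    Public Class CompiledProgram
        Public word As String

        Public Function SetObject(p As String) As Boolean
            word = p
            Return True
        End Function

        Public Function Test() As Boolean
            Return word.Length > 3
        End Function

        Public Function SelectMethod() As Object
            Return word.ToUpperInvariant()
        End Function
    End Class

    ' Drive the three methods through reflection, one source element at a time.
    Public Function Execute(programType As Type, source As IEnumerable(Of Object)) As Object()
        Dim instance As Object = Activator.CreateInstance(programType)
        Dim setObject As MethodInfo = programType.GetMethod("SetObject")
        Dim test As MethodInfo = programType.GetMethod("Test")
        Dim selectMethod As MethodInfo = programType.GetMethod("SelectMethod")

        Dim results As New List(Of Object)
        For Each element As Object In source
            ' Load the current element, test it, and project it when the test passes.
            setObject.Invoke(instance, New Object() {element})
            If CBool(test.Invoke(instance, Nothing)) Then
                results.Add(selectMethod.Invoke(instance, Nothing))
            End If
        Next
        Return results.ToArray()
    End Function

    Sub Main()
        Dim words As Object() = {"gene", "rna", "protein"}
        For Each result In Execute(GetType(CompiledProgram), words)
            Console.WriteLine(result)
        Next
    End Sub

End Module
```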

Code usage summary: o-o database query feature implementation steps

First, create a LINQ entity. In this step you just specify a custom attribute in your project:

VB
<LINQ.Framework.Reflection.LINQEntity("EntityName")>

This attribute marks the type that generates the collection for the target object mapping type. And of course, this type must implement the ILINQCollection interface, which is defined in the namespace Global.LINQ.Framework.

Then compile your LINQ entity project into a DLL file.

Second, register your compiled LINQ entity DLL file in the LINQFramework using this code:

VB
Using LINQ As LINQ.Framework.LQueryFramework = New LINQ.Framework.LQueryFramework
            Call LINQ.TypeRegistry.Register("Dll Assembly Path")

            Call LINQ.TypeRegistry.Save()
…
End Using

Then get the LINQ query script from a text box, that is, let the user type the script into the GUI of your program, and compile the LINQ script into a LINQ object model:

VB
Dim LQuery As String = <User input script string>

Dim Statement = Global.LINQ.Statements.Statement.TryParse(LQuery, LINQ.TypeRegistry) ' Compile the LINQ script into a LINQ object model

Finally, execute the LINQ script using the LINQFramework to get a collection of query result objects:

VB
Dim result = LINQ.EXEC(Statement)

And do not forget to return the result collection to your user.

Testing & example projects

Here are some examples that show what LINQ can do as a universal query script for object-oriented databases. You can find them in my test project, TestLINQEntity. Compile this test project, copy the compiled assembly file into the root directory of the WinForm test program, and then register the TestLINQEntity.dll file (Menu: File -> Registry External Module).

In the test form, the Exe method executes a LINQ script query, and the RegistryExternalModuleToolStripMenuItem_Click method registers a new external module with the LINQ framework.

VB
''' <summary>
''' Execute the linq script
''' </summary>
''' <param name="Linq"></param>
''' <remarks></remarks>
Private Sub Exe(Linq As String)
    ' Compile the LINQ script into a LINQ object model
    Dim Statement =
        Global.LINQ.Statements.Statement.TryParse(Linq, LINQFramework.TypeRegistry)

    ' Show the auto-generated code for debugging
    TextBox1.AppendText(String.Format("{0}{1}Auto-generated code for debug:{2}{3}{4}", vbCrLf, vbCrLf, vbCrLf, Statement.CompiledCode, vbCrLf))
    TextBox1.AppendText(vbCrLf & "Query Result:" & vbCrLf)

    ' Execute the query and print each result object
    Dim Collection = LINQFramework.EXEC(Statement)

    For Each obj In Collection
        Call TextBox1.AppendText(vbCrLf & obj.ToString & vbCrLf)
    Next
End Sub

''' <summary>
''' Register the LINQ entity external module
''' </summary>
''' <param name="sender"></param>
''' <param name="e"></param>
''' <remarks></remarks>
Private Sub RegistryExternalModuleToolStripMenuItem_Click(sender As Object, e As EventArgs) Handles RegistryExternalModuleToolStripMenuItem.Click
    Dim File As New Windows.Forms.OpenFileDialog
    If File.ShowDialog = Windows.Forms.DialogResult.OK Then
        Call LINQFramework.TypeRegistry.Register(File.FileName)
        Call LINQFramework.TypeRegistry.Save()
    End If

End Sub

XML Query

In the TestLINQEntity project, the source code file ExampleXML.vb defines an example LINQ entity for querying XML files. Choose Example -> Query XML from the menu to get an example LINQ query script, then click the execute button; the result of the XML file query, as produced by this LINQ framework, is shown in the output.

An example LINQ script for XML queries [Example -> Query XML]:

VB
from member as examplexml in "TEST_XML_FILE.xml" let name = member.name where microsoft.visualbasic.instr(name,"MetaCyc") select microsoft.visualbasic.mid(member.summary,1,20)
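
For comparison, here is roughly what the script above expresses when written as ordinary compiled VB LINQ over in-memory data. This is a minimal sketch under assumptions: the Member class and its sample values are hypothetical stand-ins for the examplexml entity, and the XML loading step is left out. Only InStr and Mid are taken from the script itself.

```vb
Option Strict On
Option Infer On

Imports System
Imports System.Collections.Generic
Imports System.Linq
Imports Microsoft.VisualBasic

Module XmlQuerySketch
    ' Hypothetical record type standing in for the examplexml entity.
    Class Member
        Public Property Name As String
        Public Property Summary As String
    End Class

    Function Query(members As IEnumerable(Of Member)) As String()
        ' InStr returns the 1-based position of the match, or 0 if there is none,
        ' so "> 0" spells out the test that the script leaves implicit.
        Dim q = From member In members
                Let name = member.Name
                Where InStr(name, "MetaCyc") > 0
                Select Mid(member.Summary, 1, 20)
        Return q.ToArray()
    End Function

    Sub Main()
        Dim data As New List(Of Member) From {
            New Member With {.Name = "MetaCyc pathway", .Summary = "Glycolysis converts glucose to pyruvate"},
            New Member With {.Name = "Other entry", .Summary = "Not selected"}
        }
        For Each s As String In Query(data)
            Console.WriteLine(s) ' first 20 characters of each matching summary
        Next
    End Sub
End Module
```

The script passes the Integer returned by InStr straight to where, which relies on VB's implicit Integer-to-Boolean conversion. The sketch uses an explicit comparison so that it also compiles with Option Strict On.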

Sequence Pattern Search

Some genes in a bacterial genome share an interesting sequence pattern, and these genes are sometimes very important expression regulation genes. Such a pattern can be found using a regular expression. Here is an example of how to query a FASTA database using a LINQ script to find the genes that contain a specific sequence pattern.

An example LINQ script for sequence pattern search [Example -> sequence pattern search]:

VB
from fasta as fasta in ".\xcc8004.fsa" let seq = fasta.sequence let match = regex.match(seq,"A+T{2,}GCA+TT") where match.success select fasta.tostring + microsoft.visualbasic.vbcrlf + "***" + match.value + "***" 

Image 9
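
To see what the regular expression in this script matches, the same pattern can be tried on a few in-memory strings with the standard Regex class. This is a minimal sketch: the sample sequences are made up for illustration, and FASTA file parsing is left out.

```vb
Option Strict On
Option Infer On

Imports System
Imports System.Collections.Generic
Imports System.Linq
Imports System.Text.RegularExpressions

Module PatternSearchSketch
    ' Same pattern as in the script: one or more A, two or more T,
    ' then GC, one or more A, and finally TT.
    Private ReadOnly Pattern As New Regex("A+T{2,}GCA+TT")

    ' Return the matched fragment of every sequence that contains the pattern.
    Function FindMatches(sequences As IEnumerable(Of String)) As String()
        Dim q = From seq In sequences
                Let m = Pattern.Match(seq)
                Where m.Success
                Select m.Value
        Return q.ToArray()
    End Function

    Sub Main()
        ' Made-up sequences for illustration only.
        Dim seqs As String() = {"CCAATTTGCAATTGG", "GGGCCC", "ATTGCATT"}
        For Each hit As String In FindMatches(seqs)
            Console.WriteLine("***" & hit & "***")
        Next
    End Sub
End Module
```

Here the first and third sequences match (on AATTTGCAATT and ATTGCATT), and the second one is filtered out by the Where clause, just as the where match.success clause does in the script.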

A known problem about the CodeDom compiler

There is a known issue: the compiled LINQ entity assembly files must be placed in the root directory of your program. Otherwise, when you perform a LINQ script query you will get a type missing exception, because the dynamically compiled code cannot find your compiled LINQ entity assembly file. I'm not sure whether this is a limitation of CodeDom or the result of some security consideration.

Image 10

Or a type mismatch exception like this:

Image 11
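
One general .NET mechanism that might help here is the AppDomain.AssemblyResolve event, which lets a program supply an assembly when the runtime fails to find it. The following is only a sketch under assumptions, and I have not verified it against this framework: the "Entities" folder name and the CandidatePath helper are hypothetical.

```vb
Option Strict On

Imports System
Imports System.IO
Imports System.Reflection

Module ResolveSketch
    ' Hypothetical folder that holds the compiled LINQ entity dll files.
    Private ReadOnly EntityFolder As String =
        Path.Combine(AppDomain.CurrentDomain.BaseDirectory, "Entities")

    ' Map a full assembly name to a dll path inside the entity folder.
    Function CandidatePath(fullAssemblyName As String) As String
        Dim simpleName As String = New AssemblyName(fullAssemblyName).Name
        Return Path.Combine(EntityFolder, simpleName & ".dll")
    End Function

    ' Called by the runtime when it cannot locate an assembly by itself.
    Function ResolveFromFolder(sender As Object, args As ResolveEventArgs) As Assembly
        Dim candidate As String = CandidatePath(args.Name)
        If File.Exists(candidate) Then
            Return Assembly.LoadFrom(candidate)
        End If
        Return Nothing ' let the normal failure handling continue
    End Function

    Sub Main()
        AddHandler AppDomain.CurrentDomain.AssemblyResolve, AddressOf ResolveFromFolder
        Console.WriteLine(CandidatePath("TestLINQEntity, Version=1.0.0.0, Culture=neutral, PublicKeyToken=null"))
    End Sub
End Module
```

Note that loading the same assembly file from two different locations can make the runtime treat its types as different types, which is one possible cause of a type mismatch exception. Keeping a single copy of each entity dll is therefore the safer choice.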

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)


Written By
Technical Lead PANOMIX
China
He is good and loves VisualBasic! Senior data scientist at PANOMIX


github: https://github.com/xieguigang
