data_corpus
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<?xml-stylesheet type="text/xsl" href="lexUnit.xsl"?>
<lexUnit xmlns="http://framenet.icsi.berkeley.edu" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" frame="Data" frameID="652" status="Finished_Initial" POS="N" name="data.n" ID="8538" totalAnnotated="13" xsi:schemaLocation="../schema/lexUnit.xsd">
    <header>
        <corpus description="BNC2" name="BNC2" ID="111">
            <document description="bncp" name="bncp" ID="421"/>
        </corpus>
        <frame>
            <FE fgColor="FFFFFF" bgColor="FF69B4" type="Core" abbrev="Data" name="Data"/>
            <FE fgColor="FFFFFF" bgColor="004C99" type="Peripheral" abbrev="Quantity" name="Quantity"/>
            <FE fgColor="FFFFFF" bgColor="0000FF" type="Peripheral" abbrev="Origin" name="Origin"/>
            <FE fgColor="FFFFFF" bgColor="99004C" type="Peripheral" abbrev="Characteristic" name="Characteristic"/>
            <FE fgColor="FFFFFF" bgColor="009900" type="Peripheral" abbrev="Name" name="Name"/>
        </frame>
    </header>
    <definition>This is a frame for representing Data and their properties.
This frame represents Data; the Quantity or dimensions associated with given data (e.g., a number of datasets or a number of features); the Origin of the data; its Characteristic; and its Name (e.g., of a particular dataset).</definition>
    <lexeme POS="N" name="data"/>
 
 
    <subCorpus name="01-unannotated-1">
        <sentence corpID="111" docID="421" sentNo="1" paragNo="181" aPos="0" ID="1215438">
            <text>If we are given, e.g., a cost proportion of c=0.725, and we only have ten examples in our data set, the rate 0.725 cannot be achieved with a single split point.</text>
            <annotationSet cDate="05/02/2003 04:00:38 PDT Fri" status="MANUAL" ID="1877967">
                <layer rank="1" name="Target">
                    <label end="0" start="0" name="10.1007/s10994-013-5328-9"/>
                    <label cBy="351" end="81" start="73" name="Target"/>
                    <label end="97" start="90" name="Target"/>
                </layer>
                <layer rank="1" name="FE">
                    <label end="81" start="81" name="Quantity"/>
                    <label end="97" start="90" name="Data"/>
                </layer>
                <layer rank="1" name="GF"/>
                <layer rank="1" name="PT"/>
                <layer rank="1" name="Other"/>
                <layer rank="1" name="Sent"/>
                <layer rank="1" name="Verb"/>
            </annotationSet>
        </sentence>
        <sentence corpID="111" docID="421" sentNo="2" paragNo="181" aPos="0" ID="1215438">
            <text> We generated 10 data sets simulating from each of the four models for all combinations of n=100,400, and d=2,10, for a total of 160 distinct data sets.</text>
            <annotationSet cDate="05/02/2003 04:00:38 PDT Fri" status="MANUAL" ID="1877967">
                <layer rank="1" name="Target">
                    <label end="0" start="0" name="10.1007/s10994-013-5388-x"/>
                    <label cBy="351" end="25" start="17" name="Target"/>
                    <label end="150" start="142" name="Target"/>
                </layer>
                <layer rank="1" name="FE">
                    <label end="15" start="14" name="Quantity"/>
                    <label end="25" start="17" name="Data"/>
                    <label end="111" start="27" name="Origin"/>
                    <label end="140" start="129" name="Quantity"/>
                    <label end="150" start="142" name="Data"/>
                </layer>
                <layer rank="1" name="GF"/>
                <layer rank="1" name="PT"/>
                <layer rank="1" name="Other"/>
                <layer rank="1" name="Sent"/>
                <layer rank="1" name="Verb"/>
            </annotationSet>
        </sentence>
        <sentence corpID="111" docID="421" sentNo="3" paragNo="181" aPos="0" ID="1215438">
            <text>Most of the learning-to-rank methods had been tested on (and tuned to) the relatively small LETOR data sets, published by Microsoft.</text>
            <annotationSet cDate="05/02/2003 04:00:38 PDT Fri" status="MANUAL" ID="1877967">
                <layer rank="1" name="Target">
                    <label end="0" start="0" name="10.1007/978-3-642-23780-5_27"/>
                    <label cBy="351" end="106" start="98" name="Target"/>
                </layer>
                <layer rank="1" name="FE">
                    <label end="90" start="86" name="Characteristic"/>
                    <label end="96" start="92" name="Name"/>
                    <label end="106" start="98" name="Data"/>
                    <label end="130" start="122" name="Origin"/>
 
            </layer>
                <layer rank="1" name="GF"/>
                <layer rank="1" name="PT"/>
                <layer rank="1" name="Other"/>
                <layer rank="1" name="Sent"/>
                <layer rank="1" name="Verb"/>
            </annotationSet>
        </sentence>
        <sentence corpID="111" docID="421" sentNo="4" paragNo="181" aPos="0" ID="1215438">
            <text>We select a suitable value for  in a preliminary experiment for each data set (i.e., Ailerons =218, Elevators =219, Pole Telecomm =212, Pumadyn =221) and use this value in the actual experiment.</text>
            <annotationSet cDate="05/02/2003 04:00:38 PDT Fri" status="MANUAL" ID="1877967">
                <layer rank="1" name="Target">
                    <label end="0" start="0" name="10.1007/s10994-012-5287-6"/>
                    <label cBy="351" end="76" start="69" name="Target"/>
                </layer>
                <layer rank="1" name="FE">
                    <label end="128" start="121" name="Name"/>
                    <label end="92" start="85" name="Name"/>
                    <label end="76" start="69" name="Data"/>
                    <label end="108" start="100" name="Name"/>
 
            </layer>
                <layer rank="1" name="GF"/>
                <layer rank="1" name="PT"/>
                <layer rank="1" name="Other"/>
                <layer rank="1" name="Sent"/>
                <layer rank="1" name="Verb"/>
            </annotationSet>
        </sentence>
        <sentence corpID="111" docID="421" sentNo="5" paragNo="181" aPos="0" ID="1215438">
            <text>We note that the extreme sparsity of this data set makes the prediction problem extremely difficult.</text>
            <annotationSet cDate="05/02/2003 04:00:38 PDT Fri" status="MANUAL" ID="1877967">
                <layer rank="1" name="Target">
                    <label end="0" start="0" name="10.1007/s10994-014-5444-1"/>
                    <label cBy="351" end="49" start="42" name="Target"/>
                </layer>
                <layer rank="1" name="FE">
                    <label end="32" start="25" name="Characteristic"/>
                    <label end="49" start="42" name="Data"/>
 
            </layer>
                <layer rank="1" name="GF"/>
                <layer rank="1" name="PT"/>
                <layer rank="1" name="Other"/>
                <layer rank="1" name="Sent"/>
                <layer rank="1" name="Verb"/>
            </annotationSet>
        </sentence>
        <sentence corpID="111" docID="421" sentNo="6" paragNo="181" aPos="0" ID="1215438">
            <text>Learning linear models with cross-product features may pose scalability issues for high-dimensional data sets with large number of tags.</text>
            <annotationSet cDate="05/02/2003 04:00:38 PDT Fri" status="MANUAL" ID="1877967">
                <layer rank="1" name="Target">
                    <label end="0" start="0" name="10.1007/s10994-013-5371-6"/>
                    <label cBy="351" end="108" start="100" name="Target"/>
                </layer>
                <layer rank="1" name="FE">
                    <label end="98" start="83" name="Characteristic"/>
                    <label end="108" start="100" name="Data"/>
                    <label end="134" start="110" name="Characteristic"/>
            </layer>
                <layer rank="1" name="GF"/>
                <layer rank="1" name="PT"/>
                <layer rank="1" name="Other"/>
                <layer rank="1" name="Sent"/>
                <layer rank="1" name="Verb"/>
            </annotationSet>
        </sentence>
        <sentence corpID="111" docID="421" sentNo="7" paragNo="181" aPos="0" ID="1215438">
            <text>In this data set, 741 out of 1057 are annotated by 5 doctors and each sample has 323 features.</text>
            <annotationSet cDate="05/02/2003 04:00:38 PDT Fri" status="MANUAL" ID="1877967">
                <layer rank="1" name="Target">
                    <label end="0" start="0" name="10.1007/s10994-013-5412-1"/>
                    <label cBy="351" end="15" start="8" name="Target"/>
                </layer>
                <layer rank="1" name="FE">
                    <label end="20" start="18" name="Quantity"/>
                    <label end="15" start="8" name="Data"/>
                    <label end="32" start="29" name="Quantity"/>
                    <label end="59" start="38" name="Origin"/>
                    <label end="92" start="77" name="Quantity"/>
                    <label end="75" start="70" name="Data"/>
            </layer>
                <layer rank="1" name="GF"/>
                <layer rank="1" name="PT"/>
                <layer rank="1" name="Other"/>
                <layer rank="1" name="Sent"/>
                <layer rank="1" name="Verb"/>
            </annotationSet>
        </sentence>
        <sentence corpID="111" docID="421" sentNo="8" paragNo="181" aPos="0" ID="1215438">
            <text>The focus of this paper is on joint feature re-extraction and classification in cases when the training data set is small.</text>
            <annotationSet cDate="05/02/2003 04:00:38 PDT Fri" status="MANUAL" ID="1877967">
                <layer rank="1" name="Target">
                    <label end="0" start="0" name="10.1007/s10994-007-5039-1"/>
                    <label cBy="351" end="102" start="95" name="Target"/>
                    <label cBy="351" end="111" start="104" name="Target"/>
                </layer>
                <layer rank="1" name="FE">
                    <label end="120" start="116" name="Characteristic"/>
                    <label end="111" start="95" name="Data"/>
            </layer>
                <layer rank="1" name="GF"/>
                <layer rank="1" name="PT"/>
                <layer rank="1" name="Other"/>
                <layer rank="1" name="Sent"/>
                <layer rank="1" name="Verb"/>
            </annotationSet>
        </sentence>
        <sentence corpID="111" docID="421" sentNo="9" paragNo="181" aPos="0" ID="1215438">
            <text>The three experiments reported here employed synthetic data sets, constructed so as to have the precise properties required to test specific hypotheses.</text>
            <annotationSet cDate="05/02/2003 04:00:38 PDT Fri" status="MANUAL" ID="1877967">
                <layer rank="1" name="Target">
                    <label end="0" start="0" name="10.1023/A:1022609119415"/>
                    <label cBy="351" end="63" start="55" name="Target"/>
                </layer>
                <layer rank="1" name="FE">
                    <label end="53" start="45" name="Characteristic"/>
                    <label end="63" start="55" name="Data"/>
            </layer>
                <layer rank="1" name="GF"/>
                <layer rank="1" name="PT"/>
                <layer rank="1" name="Other"/>
                <layer rank="1" name="Sent"/>
                <layer rank="1" name="Verb"/>
            </annotationSet>
        </sentence>
        <sentence corpID="111" docID="421" sentNo="10" paragNo="181" aPos="0" ID="1215438">
            <text>The processed sound sample is represented by a 3-dimensional feature vector.</text>
            <annotationSet cDate="05/02/2003 04:00:38 PDT Fri" status="MANUAL" ID="1877967">
                <layer rank="1" name="Target">
                    <label end="0" start="0" name="10.1007/s10994-012-5297-4"/>
                    <label cBy="351" end="25" start="20" name="Target"/>
                </layer>
                <layer rank="1" name="FE">
                    <label end="59" start="47" name="Quantity"/>
                    <label end="25" start="20" name="Data"/>
                    <label end="74" start="61" name="Data"/>
 
            </layer>
                <layer rank="1" name="GF"/>
                <layer rank="1" name="PT"/>
                <layer rank="1" name="Other"/>
                <layer rank="1" name="Sent"/>
                <layer rank="1" name="Verb"/>
            </annotationSet>
        </sentence>
        <sentence corpID="111" docID="421" sentNo="11" paragNo="181" aPos="0" ID="1215438">
            <text>Our experimental results demonstrate that using prior knowledge about the structure, even with hidden variables, can significantly improve the learning rate of probabilistic networks.</text>
            <annotationSet cDate="05/02/2003 04:00:38 PDT Fri" status="MANUAL" ID="1877967">
                <layer rank="1" name="Target">
                    <label end="0" start="0" name="10.1023/A:1007421730016"/>
                    <label cBy="351" end="62" start="48" name="Target"/>
                </layer>
                <layer rank="1" name="FE">
                    <label end="110" start="85" name="Characteristic"/>
                    <label end="82" start="48" name="Data"/>
                    <label end="181" start="160" name="Data"/>
 
            </layer>
                <layer rank="1" name="GF"/>
                <layer rank="1" name="PT"/>
                <layer rank="1" name="Other"/>
                <layer rank="1" name="Sent"/>
                <layer rank="1" name="Verb"/>
            </annotationSet>
        </sentence>
        <sentence corpID="111" docID="421" sentNo="12" paragNo="181" aPos="0" ID="1215438">
            <text>This paper explores the incorporation of prior knowledge in support vector regression by the addition of constraints.</text>
            <annotationSet cDate="05/02/2003 04:00:38 PDT Fri" status="MANUAL" ID="1877967">
                <layer rank="1" name="Target">
                    <label end="0" start="0" name="10.1007/s10994-007-5035-5"/>
                    <label cBy="351" end="56" start="42" name="Target"/>
                </layer>
                <layer rank="1" name="FE">
                    <label end="56" start="42" name="Data"/>
            </layer>
                <layer rank="1" name="GF"/>
                <layer rank="1" name="PT"/>
                <layer rank="1" name="Other"/>
                <layer rank="1" name="Sent"/>
                <layer rank="1" name="Verb"/>
            </annotationSet>
        </sentence>
        <sentence corpID="111" docID="421" sentNo="13" paragNo="181" aPos="0" ID="1215438">
            <text>In the testing set we introduce missing values at random with a fixed probability \(p=0.5\).</text>
            <annotationSet cDate="05/02/2003 04:00:38 PDT Fri" status="MANUAL" ID="1877967">
                <layer rank="1" name="Target">
                    <label end="0" start="0" name="10.1007/s10994-014-5450-3"/>
                    <label cBy="351" end="45" start="32" name="Target"/>
                </layer>
                <layer rank="1" name="FE">
                    <label end="45" start="32" name="Data"/>
                    <label end="55" start="46" name="Characteristic"/>
                    <label end="90" start="62" name="Quantity"/>
            </layer>
                <layer rank="1" name="GF"/>
                <layer rank="1" name="PT"/>
                <layer rank="1" name="Other"/>
                <layer rank="1" name="Sent"/>
                <layer rank="1" name="Verb"/>
            </annotationSet>
        </sentence>
 
    </subCorpus>
</lexUnit>
data_corpus.txt · Last modified: 2016/03/17 13:07 by pj