Japanese Grammar Pdf 98283

Partial capture of text on file.
Analysis Grammar of Japanese in the Mu-Project
- A Procedural Approach to Analysis Grammar -

Jun-ichi TSUJII, Jun-ichi NAKAMURA and Makoto NAGAO
Department of Electrical Engineering
Kyoto University
Kyoto, JAPAN

Abstract

Analysis grammar of Japanese in the Mu-project is presented. It is emphasized that rules expressing constraints on single linguistic structures and rules for selecting the most preferable readings are completely different in nature, and that rules for selecting preferable readings should be utilized in analysis grammars of practical MT systems. It is also claimed that procedural control is essential in integrating such rules into a unified grammar. Some sample rules are given to make the points of discussion clear and concrete.

1. Introduction

The Mu-Project is a Japanese national project supported by grants from the Special Coordination Funds for Promoting Science & Technology of STA (Science and Technology Agency), which aims to develop Japanese-English and English-Japanese machine translation systems. We currently restrict the domain of translation to abstracts of scientific and technological papers. The systems are based on the transfer approach[1], and consist of three phases: analysis, transfer and generation. In this paper, we focus on the analysis grammar of Japanese in the Japanese-English system. The grammar has been developed by using GRADE, which is a programming language specially designed for this project[2]. The grammar now consists of about 900 GRADE rules. The experiments so far show that the grammar works very well and is comprehensive enough to treat various linguistic phenomena in abstracts. In this paper we will discuss some of the basic design principles of the grammar together with its detailed construction. Some examples of grammar rules and analysis results will be shown to make the points of our discussion clear and concrete.

2. Procedural Grammar

There has been a prominent tendency in recent computational linguistics to re-evaluate CFG and use it directly or augment it to analyze sentences[3,4,5]. In these systems (frameworks), CFG rules independently describe constraints on single linguistic structures, and a universal rule application mechanism automatically produces a set of possible structures which satisfy the given constraints. It is well-known, however, that such sets of possible structures often become unmanageably large.

Because two separate rules such as

    NP --> NP PREP-P
    VP --> VP PREP-P

are usually prepared in CFG grammars in order to analyze noun and verb phrases modified by prepositional phrases, CFG grammars provide two syntactic analyses for

    She was given flowers by her uncle.

Furthermore, the ambiguity of the sentence is doubled by the lexical ambiguity of "by", which can be read as either a locative or an agentive preposition. Since the two syntactic structures are recognized by completely independent rules and the semantic interpretations of "by" are given by independent processes in the later stages, it is difficult to compare these four readings during the analysis to give a preference to one of these four readings.

A rule such as

    "If a sentence is passive and there is a "by"-prepositional phrase, it is often the case that the prepositional phrase fills the deep agentive case. (try this analysis first)"

seems reasonable and quite useful for choosing the most preferable interpretation, but it cannot be expressed by refining the ordinary CFG rules. This kind of rule is quite different in nature from a CFG rule. It is not a rule of constraint on a single linguistic structure (in fact, the above four readings are all linguistically possible), but it is a "heuristic" rule concerned with preference of readings, which compares several alternative analysis paths and chooses the most feasible one. Human translators (or humans in general) have many
such preference rules based on various sorts of cues such as morphological forms of words, collocations of words, text styles, word semantics, etc. These heuristic rules are quite useful not only for increasing efficiency but also for preventing proliferation of analysis results. As Wilks[6] pointed out, we cannot use semantic information as constraints on single linguistic structures, but just as preference cues to choose the most feasible interpretations among linguistically possible interpretations. We claim that many sorts of preference cues other than semantic ones exist in real texts which cannot be captured by CFG rules. We will show in this paper that, by utilizing various sorts of preference cues, our analysis grammar of Japanese can work almost deterministically to give the most preferable interpretation as the first output, without any extensive semantic processing (note that even "semantic" processing cannot disambiguate the above sentence. The four readings are semantically possible. It requires deep understanding of contexts or situations, which we cannot expect in a practical MT system).

In order to integrate heuristic rules based on various levels of cues into a unified analysis grammar, we have developed a programming language, GRADE. GRADE provides us with the following facilities.

  - Explicit Control of Rule Applications : Heuristic rules can be ordered according to their strength (See 4-2).

  - Multiple Relation Representation : Various levels of information including morphological, syntactic, semantic, logical etc. are expressed in a single annotated tree and can be manipulated at any time during the analysis. This is required not only because many heuristic rules are based on heterogeneous levels of cues, but also because the analysis grammar should perform semantic/logical interpretation of sentences at the same time and the rules for these phases should be written in the same framework as syntactic analysis rules (See 4-2, 4-4).

  - Lexicon Driven Processing : We can write heuristic rules specific to a single or a limited number of words such as rules concerned with collocations among words. These rules are strong in the sense that they almost always succeed. They are stored in the lexicon and invoked at appropriate times during the analysis without decreasing efficiency (See 4-1).

  - Explicit Definition of Analysis Strategies : The whole analysis phase can be divided into steps. This makes the whole grammar efficient, natural and easy to read. Furthermore, strategic consideration plays an essential role in preventing undesirable interpretations from being generated (See 4-3).

3. Organization of Grammar

In this section, we will give the organization of the grammar necessary for understanding the discussion in the following sections. The main components of the grammar are as follows.

(1) Post-Morphological Analysis
(2) Determination of Scopes
(3) Analysis of Simple Noun Phrases
(4) Analysis of Simple Sentences
(5) Analysis of Embedded Sentences (Relative Clauses)
(6) Analysis of Relationships of Sentences
(7) Analysis of Outer Cases
(8) Contextual Processing (Processing of Omitted case elements, Interpretation of 'Ha', etc.)
(9) Reduction of Structures for Transfer Phase

Each component consists of from 60 to 120 GRADE rules.

47 morpho-syntactic categories are provided for Japanese analysis, each of which has its own lexical description format. 12,000 lexical entries have already been prepared according to the formats. In this classification, Japanese nouns are categorized into 8 sub-classes according to their morpho-syntactic behaviour, and 53 semantic markers are used to characterize their semantic behaviour. Each verb has a set of case frame descriptions (CFD) which correspond to different usages of the verb. A CFD gives mapping rules between surface case markers (SCM - postpositional case particles are used as SCM's in Japanese) and their deep case interpretations (DCI - 33 deep cases are used). DCI of an SCM often depends on verbs so that the mapping rules are given to CFD's of individual verbs. A CFD also gives a normal collocation between the verb and SCM's (postpositional case particles). Detailed lexical descriptions are given and discussed in another paper[7].

The analysis results are dependency trees which show the semantic relationships among input words.

4. Typical Steps of Analysis Grammar

In the following, we will take some sample rules to illustrate our points of discussion.

4-1 Relative Clauses

Relative clause constructions in Japanese express several different relationships between modifying clauses (relative clauses) and their antecedents. Some relative clause constructions
cannot be translated as relative clauses in English. We classified Japanese relative clauses into the following four types, according to the relationships between clauses and their antecedents.

(1) Type 1 : Gaps in Cases
    One of the case elements of the relative clause is deleted and the antecedent fills the gap.

(2) Type 2 : Gaps in Case Elements
    The antecedent modifies a case element in the clause. That is, a gap exists in a noun phrase in the clause.

(3) Type 3 : Apposition
    The clause describes the content of the antecedent as the English "that"-clause in 'the idea that the earth is round'.

(4) Type 4 : Partial Apposition
    The antecedent and the clause are related by certain semantic/pragmatic relationships. The relative clause of this type doesn't have any gaps. This type cannot be translated directly into English relative clauses. We have to interpolate in English appropriate phrases or clauses which are implicit in Japanese, in order to express the semantic/pragmatic relationships between the antecedents and relative clauses explicitly. In other words, gaps exist in the interpolated phrases or clauses.

Because the above four types of relative clauses have the same surface forms in Japanese

    .........  (verb)    (noun).
    Relative Clause      Antecedent

careful processing is required to distinguish them (note that the "antecedents" -modified nouns- are located after the relative clauses in Japanese). A sophisticated analysis procedure has already been developed, which fully utilizes various levels of heuristic cues as follows.

(Rule 1) There are a limited number of nouns which are often used as antecedents of Type 3 clauses.

(Rule 2) When nouns with certain semantic markers appear in the relative clauses and those nouns are followed by one of specific postpositional case particles, there is a high possibility that the relative clauses are Type 2. In the following example, the word "SHORISOKUDO" (processing speed) has the semantic marker AO (attribute).

[ex-1] [Type 2]

    "SHORISOKUDO"       "GA"              "HAYAI"   "KEISANKI"
    (processing speed)  (case particle:   (high)    (computer)
                         subject case)
    |---------------- Relative Clause ----------|   Antecedent

--> (English Translation)
    A computer whose processing speed is high

(Rule 3) Nouns such as "MOKUTEKI" (purpose), "GEN'IN" (reason), "SHUDAN" (method) etc. express deep case relationships by themselves, and, when these nouns appear as antecedents, it is often the case that they fill the gaps of the corresponding deep cases in the relative clauses.

[ex-2] [Type 1]

    "KONO"  "SOUCHI"  "O"               "TSUKAT"  "TA"               "MOKUTEKI"
    (this)  (device)  (case particle:   (to use)  (tense formative:  (purpose)
                       object case)                past)
    |---------------------- Relative Clause ----------------------|  Antecedent

--> (English Translation)
    The purpose for which (someone) used this device
    The purpose of using this device

(Rule 4) There is a limited number of nouns which are often used as antecedents in Type 4 relative clauses. Each of such nouns requires a specific phrase or clause to be interpolated in English.

[ex-3] [Type 4]

    "KONO"  "SOUCHI"  "O"               "TSUKAT"  "TA"               "KEKKA"
    (this)  (device)  (case particle:   (to use)  (tense formative:  (result)
                       object case)                past)
    |---------------------- Relative Clause ----------------------|  Antecedent

--> (English Translation)
    The result which was obtained by using this device

In the above example, the clause "the result which someone obtained (the result : gap)" is omitted in Japanese, which relates the antecedent "KEKKA" (result) and the relative clause "KONO SOUCHI O TSUKAT_TA" (someone used this device).
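Rules 1 to 4 above can be read as an ordered list of preference cues. The following Python fragment is only a minimal sketch of that reading, not the GRADE rules themselves. The noun and marker sets contain only the items named in the examples. The list of Type 3 antecedents is not given in the text, so it is passed in as a parameter that is empty by default. The particle set for Rule 2 and the order in which the cues are tried are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Optional, Set, Tuple


@dataclass
class Word:
    form: str          # romanized form, e.g. "SHORISOKUDO"
    markers: Set[str]  # semantic markers, e.g. {"AO"}


# Items taken from the examples in the text.
TYPE1_CASE_NOUNS = {"MOKUTEKI", "GEN'IN", "SHUDAN"}  # Rule 3
TYPE4_NOUNS = {"KEKKA"}                              # Rule 4, see [ex-3]
TYPE2_MARKERS = {"AO"}                               # Rule 2, see [ex-1]
# Assumption: the text only shows GA in [ex-1]; the real set is not given.
TYPE2_PARTICLES = {"GA"}


def preferred_type(
    clause_nouns: List[Tuple[Word, str]],  # (noun, particle that follows it)
    antecedent: str,
    type3_nouns: FrozenSet[str] = frozenset(),  # Rule 1 list, not in the text
) -> Optional[int]:
    """Return the relative clause type to try first, or None if no cue fires."""
    # Cues on the antecedent (Rules 1, 3, 4). The order is an assumption.
    if antecedent in type3_nouns:
        return 3
    if antecedent in TYPE1_CASE_NOUNS:
        return 1
    if antecedent in TYPE4_NOUNS:
        return 4
    # Cue inside the clause (Rule 2): marked noun followed by a listed particle.
    for word, particle in clause_nouns:
        if word.markers & TYPE2_MARKERS and particle in TYPE2_PARTICLES:
            return 2
    return None
```

With the examples above, `preferred_type([(Word("SHORISOKUDO", {"AO"}), "GA")], "KEISANKI")` yields 2, and a clause whose antecedent is "MOKUTEKI" yields 1. The result is only a preference about which analysis to attempt first, in the spirit of "try this analysis first".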
A set of lexical rules is defined for "KEKKA" (result), which basically works as follows : it examines first whether the deep object case has already been filled by a noun phrase in the relative clause. If so, the relative clause is taken as type 4 and an appropriate phrase is interpolated as in [ex-3]. If not, the relative clause is taken as type 1 as in the following example where the noun "KEKKA" (result) fills the gap of object case in the relative clause.

[ex-4] [Type 1]

    "KONO"  "JIKKEN"      "GA"              "TSUKAT"  "TA"               "KEKKA"
    (this)  (experiment)  (case particle:   (to use)  (tense formative:  (result)
                           subject case)               past)
    |------------------------ Relative Clause ------------------------|  Antecedent

--> (English Translation)
    The result which this experiment used

Such lexical rules are invoked at the beginning of the relative clause analysis by a rule in the main flow of processing. The noun "KEKKA" (result) is given a mark as a lexical property which indicates the noun has special rules to be invoked when it appears as an antecedent of a relative clause. All the nouns which require special treatments in the relative clause analysis are given the same marker. The rule in the main flow only checks this mark and invokes the lexical rules defined in the lexicon.

(Rule 5) Only the cases marked by postpositional case particles 'GA', 'WO' and 'NI' can be deleted in Type 1 relative clauses, when the antecedents are ordinary nouns. Gaps in Type 1 relative clauses can have other surface case marks, only when the antecedents are special nouns such as described in Rule (3).

4-2 Conjuncted Noun Phrases

Conjuncted noun phrases often appear in abstracts of scientific and technological papers. It is important to analyze them correctly, especially to determine scopes of conjunctions correctly, because they often lead to proliferation of analysis results. The particle "TO" plays almost the same role as the English "and" to conjunct noun phrases. There are several heuristic rules based on various levels of information to determine the scopes.

(Rule 1) Since particle "TO" is also used as a case particle, if it appears in the position:

    Noun 'TO' verb Noun,
    Noun 'TO' adjective Noun,

there are two possible interpretations, one in which "TO" is a case particle and "noun TO adjective(verb)" forms a relative clause that modifies the second noun, and the other one in which "TO" is a conjunctive particle to form a conjuncted noun phrase. However, it is very likely that the particle 'TO' is not a conjunctive particle but a post-positional case particle, if the adjective (verb) is one of adjectives (verbs) which require case elements with surface case mark 'TO' and there are no extra words between 'TO' and the adjective (verb). In the following example, "KOTONARU (to be different)" is an adjective which is often collocated with a noun phrase followed by case particle "TO".

[ex-5]

    YOSOKU-CHI         "TO"   KOTONARU            ATAI
    (predicted value)         (to be different)   (value)

[dominant interpretation]

    [ YOSOKU-CHI  "TO"  KOTONARU ]   ATAI
          relative clause            antecedent

    = the value which is different from the predicted value

[less dominant interpretation]

    [ YOSOKU-CHI ]  "TO"  [ KOTONARU  ATAI ]
          NP                     NP
    |--------- conjuncted noun phrase ---------|

    = the predicted value and the different value

(Rule 2) If two "TO" particles appear in the position:

    Noun-1 'TO' .......... Noun-2 'TO' 'NO' Noun-3

the right boundary of the scope of the conjunction is almost always Noun-2. The second 'TO' plays a role of a delimiter which delimits the right boundary of the conjunction. This 'TO' is optional, but in real texts one often places it to make the scope unambiguous, especially when the second conjunct is a long noun phrase and the scope is highly ambiguous without it. Because the second [...] a delimiter of the conjunction) and 'NO' following a case particle turns the preceding phrase to a
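The two scope heuristics for "TO" in Section 4-2 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the GRADE implementation. Sentences are represented as flat lists of romanized tokens, which is a simplification of the annotated trees the paper describes. The set `TO_TAKING_PREDICATES` is a hypothetical lexicon fragment that contains only the one adjective named in the text. The return value "undecided" is also an assumption, since the text states a preference only for the case in which Rule 1 fires.

```python
from typing import List, Optional, Set

# Hypothetical lexicon fragment: adjectives and verbs that require a case
# element marked by "TO". Only KOTONARU is named in the text.
TO_TAKING_PREDICATES: Set[str] = {"KOTONARU"}


def to_reading(tokens: List[str], to_index: int) -> str:
    """Rule 1: preferred reading of the particle TO at tokens[to_index].

    Returns "case" when the word immediately after TO is a predicate that
    takes a TO-marked case element, otherwise "undecided".
    """
    has_next = to_index + 1 < len(tokens)
    if has_next and tokens[to_index + 1] in TO_TAKING_PREDICATES:
        return "case"
    return "undecided"


def right_boundary(tokens: List[str]) -> Optional[int]:
    """Rule 2: for 'N1 TO ... N2 TO NO N3', return the index of N2.

    Returns None when the pattern does not occur.
    """
    to_positions = [i for i, tok in enumerate(tokens) if tok == "TO"]
    for first, second in zip(to_positions, to_positions[1:]):
        followed_by_no = second + 1 < len(tokens) and tokens[second + 1] == "NO"
        # There must be at least one word (N2) between the two TO particles.
        if followed_by_no and second - 1 > first:
            return second - 1
    return None
```

For [ex-5], `to_reading(["YOSOKU-CHI", "TO", "KOTONARU", "ATAI"], 1)` selects the case-particle reading, which corresponds to the dominant interpretation. For a token list of the form `["N1", "TO", "X", "N2", "TO", "NO", "N3"]`, `right_boundary` returns the index of "N2".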