[Abstract]

  • novel Graph-based neural Architecture Encoding Schems (GATES)
  • can encode “operation on node” and “operation on edge” consistently

[1. Introduction]

  • two key components in a NAS framework
    1. architecture searching module
    2. architecture evaluation module
      • training every candidate architecture until convergence (thousands of architectures)
      • computational burden is large

      ⇒ directions

      1) evaluation: accelerate the evaluation of each architeucture (w.r.t. ranking correlation)

      2) searching: increasing sample efficiency so that fewer architectures are needed to evaluated

      • solution: learn approximated performance predictor → utilize the predictor to sample potentially good architectures
      • performance predictor predicts performance based on encodings
        • sequence-based scheme
        • graph-based scheme: GCN
          • for encoding operations, it is more natural to encoding them as transforms of node attrbutes (mimic the processing of the information) than just node attributes
  • introduce GATES
    • models infornation as the attributes of input nodes
    • data processing of the operations are modeled by GATES as different transforms

    ⇒ embeddings of isomorphic architectures are the same

[2. Related Work]

[2.1 Architeture Evaluation Module]

  • parameter sharing: construct supernet → one-shot
    • effective, but not accurate and not generally applicable

[2.2 Architecture Searching Module]

  • RL based, Evolutionary methods, gradient-based method, MCTS method
  • utilize a performance predictor to samplem new architectures
  • NASBot, NAO

[2.3 Neural Architecture Encoders]

  • sequence-basd : lose of information
  • graph-based: hard to encode OOE (concatenation cannot be generalized)

[3. Method]

[3.1 Predictor-Based Neural Architecture Search]

Untitled

[3.2 GATES: A Generic Neural Architecture Encoder]

$\hat{s} = P(a)=MLP(Enc(a))$

Untitled

Untitled

Untitled

Untitled

  • difference between GATES and GCN
    • GATES models operations as the processing of the node attributes
    • GCN models them as the node attributes themselves
  • representational power

    1) reasonable modeling of DAGs

    2) intrinsic power handling of DAG isomorphism

[3.3 Neural Architecture Search Utilizing the Predictor]

  • random sample n architectures → select the best k
  • search with EA for n steps → select the best k

[4. Experiments]

[4.2 Predictor Evaluation on NAS-Bench-201]

  • 15625 architectures in OOE search space
    • the first 50%: training data
    • remaning: test data

      Untitled