Parsing stops with Akka Streams mapAsync

user6719, published on September 21, 2018, 8:02 am

I am parsing 50,000 records from a web page, each consisting of a title and a URL. While parsing, I write them to a PostgreSQL database. The application is deployed with docker-compose. However, it keeps stopping on some page for no apparent reason. I added logging to find out what is happening, but no connection error or anything similar shows up.

Here is my code for parsing and writing to the database:

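import akka.actor.ActorSystem
import akka.stream.ActorMaterializer
import akka.stream.scaladsl.{Sink, Source}
// Assumed: JsoupBrowser comes from the scala-scraper library
import net.ruippeixotog.scalascraper.browser.JsoupBrowser
// Assumed: Database.forURL comes from Slick's PostgreSQL profile
import slick.jdbc.PostgresProfile.api._

object App {
  val db = Database.forURL("jdbc:postgresql://db:5432/toloka?user=user&password=password")
  val browser = JsoupBrowser()
  val catRepo = new CategoryRepo(db)
  val torrentRepo = new TorrentRepo(db)
  val torrentForParseRepo = new TorrentForParseRepo(db)
  val parallelismFactor = 10 // max number of batch inserts in flight
  val groupFactor = 10       // records per batch insert
  implicit val system = ActorSystem("TolokaParser")
  implicit val materializer = ActorMaterializer()
  implicit val executionContext = system.dispatcher

  // Groups the parsed records into batches and writes them to the database.
  // Returns the Future materialized by Sink.ignore.
  def parseAndWriteTorrentsForParseToDb(doc: App.browser.DocumentType) = {
    Source(getRecordsLists(doc))
      .grouped(groupFactor)
      .mapAsync(parallelismFactor) { torrentForParse: Seq[TorrentForParse] =>
        torrentForParseRepo.createInBatch(torrentForParse)
      }
      .runWith(Sink.ignore)
  }

  // Fetches every listing page and collects the (title, link) tuples.
  def getRecordsLists(doc: App.browser.DocumentType) = {
    val pages = generatePagesFromHomePage(doc)
    println("torrent links generated")
    println(pages.size)
    val result = for {
      page <- pages
    } yield {
      println(s"Parsing torrent list...$page")
      // Fetch the page once and reuse it for both titles and links
      val pageDoc = browser.get(page)
      val tmp = getTitlesAndLinksTuple(getTitlesList(pageDoc), getLinksList(pageDoc))
      println(tmp.size)
      tmp
    }
    println("torrent links and names tupled")
    result.flatten
  }

}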

What may be the cause of such problems?

  • .runWith(Sink.ignore) returns a Future. Do you handle the case where it fails with an exception? – Luca T. Feb 13 at 20:01
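
To illustrate the point raised in the comment, here is a minimal sketch of one possible explanation, not a confirmed diagnosis. With the default supervision strategy, a failed Future inside mapAsync fails the whole stream, and the error only surfaces through the Future returned by runWith(Sink.ignore). If nobody inspects that Future, the stream simply appears to stop. The names FailureDemo, fakeCreateInBatch and run are hypothetical stand-ins; fakeCreateInBatch only imitates torrentForParseRepo.createInBatch and fails on purpose for one batch.

```scala
import akka.Done
import akka.actor.ActorSystem
import akka.stream.ActorMaterializer
import akka.stream.scaladsl.{Sink, Source}

import scala.concurrent.{Await, Future}
import scala.concurrent.duration._

object FailureDemo {
  // Hypothetical stand-in for torrentForParseRepo.createInBatch:
  // fails on purpose for the batch that contains record 25.
  def fakeCreateInBatch(batch: Seq[Int]): Future[Int] =
    if (batch.contains(25))
      Future.failed(new RuntimeException("insert failed for batch containing record 25"))
    else
      Future.successful(batch.size)

  // Runs the stream and returns a message describing how it ended.
  def run(): String = {
    implicit val system = ActorSystem("FailureDemo")
    implicit val materializer = ActorMaterializer()
    import system.dispatcher

    val done: Future[Done] =
      Source(1 to 50)
        .grouped(10)
        .mapAsync(2)(fakeCreateInBatch)
        .runWith(Sink.ignore)

    // Turn both outcomes into a message. Without inspecting `done`,
    // the failure would not be reported anywhere.
    val outcome: Future[String] =
      done
        .map(_ => "stream completed")
        .recover { case e => s"stream failed: ${e.getMessage}" }

    try Await.result(outcome, 10.seconds)
    finally system.terminate()
  }

  def main(args: Array[String]): Unit =
    println(run())
}
```

In a long-running application, attaching onComplete or recover to the Future returned by parseAndWriteTorrentsForParseToDb and logging the exception would serve the same purpose without blocking on Await.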
