What is the difference between lda[doc_bow] and lda.inference(corpus)?

In the LDA model, I think these are two ways of inferring topics for new documents with an existing model. What are the differences between the two?



1 answer


I ran some tests with an ldamodel that has 8 topics; here are my results. These are the 2 unseen documents whose topics I want to predict:

list_unseenTw=[['hope', 'miley', 'blow', 'peopl', 'mind', 'tonight', 'gain', 'million', 'fan'],['@mileycyrustour', "we'r", 'think', "it'", 'pretti', 'cool', 'miley', 'saturday', 'night', 'live', 'tonight', '#prettycool']]

      

  • Prediction with ldamodel[doc_bow] (it already gives the probability of each topic)

    doc_bow = [dictionary.doc2bow(text) for text in list_unseenTw]
    predictions = ldamodel[doc_bow]

    predictions[0]: [(0, 0.02509002728802024), (1, 0.0250114373070437), (2, 0.025040162139306051), (3, 0.82462688228515812), (4, 0.025150924341817767), (5, 0.025000027675139792), (6, 0.025000024127660267), (7, 0.025080514835853926)]

    predictions[1]: [(0, 0.031250011319462589), (1, 0.031250013721820222), (2, 0.031250019639505598), (3, 0.031250015093378707), (4, 0.031250019670816337), (5, 0.0312500248607396), (6, 0.78124988084026048), (7, 0.031250014854016454)]

  • Prediction with ldamodel.inference (the results are unnormalized weights, not probabilities)

    pred = ldamodel.inference(doc_bow)

    print(pred)

    (array([[0.12545023, 0.1250572, 0.12520085, 4.12309694, 0.12579184, 0.12500014, 0.12500012, 0.12540268], [0.12500005, 0.12500005, 0.12500008, 0.12500006, 0.12500008, 0.1250001, 3.12499952, 0.12500006]]), None)



As you can see, the result for the first prediction (doc1) is the same (topic 3) once you divide the weight of that topic by the sum of all weights:

total = 0
for i in pred[0][0]:
    total += i

4.12309694 / total ≈ 0.82462 (about 82.5%)
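The same normalization can be sketched for both documents at once. This is a minimal plain-Python illustration, not gensim code: the `gamma` rows are copied from the `ldamodel.inference` output above, and `normalize_rows` and `topic_probs` are hypothetical names introduced only for this example.

```python
# Weights copied from the ldamodel.inference output above (one row per document).
gamma = [
    [0.12545023, 0.1250572, 0.12520085, 4.12309694,
     0.12579184, 0.12500014, 0.12500012, 0.12540268],
    [0.12500005, 0.12500005, 0.12500008, 0.12500006,
     0.12500008, 0.1250001, 3.12499952, 0.12500006],
]


def normalize_rows(rows):
    """Divide every value by the sum of its row, so each row adds up to 1."""
    normalized = []
    for row in rows:
        total = sum(row)
        normalized.append([value / total for value in row])
    return normalized


topic_probs = normalize_rows(gamma)

for doc_id, probs in enumerate(topic_probs):
    # Index of the largest probability is the dominant topic of the document.
    best_topic = max(range(len(probs)), key=probs.__getitem__)
    print(doc_id, best_topic, round(probs[best_topic], 5))
```

After normalization, topic 3 dominates the first document and topic 6 the second, with probabilities close to the ones reported by ldamodel[doc_bow].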

      
